Dynamic characterization of synthetic genetic circuits in living cells

ABSTRACT

The present invention relates to a method for determining one or more intrinsic properties of a DNA component from a plurality of measurements obtained over a time period from a cell culture, with each cell comprising the DNA component, wherein the DNA component is involved in transcription of one or more target signals, wherein the plurality of measurements comprises measurements relating to the density of the cell culture over the time period and measurements relating to the amount of the one or more target signals in the cell culture over the time period.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation application of and claims priority toU.S. patent application Ser. No. 16/252,561, entitled “DYNAMICCHARACTERIZATION OF SYNTHETIC GENETIC CIRCUITS IN LIVING CELLS,” filedon Jan. 18, 2019, which claims priority to GB Application No. 1810636.9,entitled “DYNAMIC CHARACTERIZATION OF SYNTHETIC GENETIC CIRCUITS INLIVING CELLS,” filed on Jun. 28, 2018, the disclosures of which areincorporated herein by reference in their entireties.

BACKGROUND

Synthetic biology includes designing and building artificial biologicalsystems to aid in the investigation of how biological systems (andcomponents of those systems) work, and to perform useful functions, suchas the production of biological medicinal products or the conversion ofwaste material into valuable chemical products.

However, due to the complexity of biological systems, it has beendifficult to establish methods that can be applied to characterise theircomponents, which would then allow new systems with predictablebehaviour to be designed. As such, a lot of effort has to be put into invitro testing and modification of new systems in order to optimise them.

Generating an artificial biological system generally involvesintroducing anew genetic circuit into a host, such as a bacterial cell.The cell's innate machinery is then utilised to perform a new functionbased on the genetic circuit. Synthetic genetic circuits are constructedvia genetic engineering from DNA-based components, e.g. protein coding,promoter and enhancer sequences etc. In general, the behaviour of thesecomponents in a living system is currently not predictable from theirsequence. Additionally, quantitative characterisation is complicated bythe activity of components depending on the concentration of regulatorsand the metabolic activity of the cell, both of which change over time.As such, when observing the performance of a particular genetic circuitin a cell, it is difficult to separate the intrinsic properties of theDNA-based component (e.g. the ability of a promoter to recruitpolymerase) from extrinsic factors (e.g. cellular metabolic state,ribosome availability). Knowledge of the intrinsic properties ofDNA-based components would be useful in the process of designing newgenetic circuits with predictable behaviour.

Many methodologies have been developed in the past to infer parametersof ordinary differential equation (ODE) models of genetic componentbehaviour, but the majority of these methods do not take into accountthe metabolic state of the cell.

Recently it has been shown that variations in genetic circuit activityresulting from metabolic differences between cell cultures can bemitigated using so-called ratiometric methods. These methods divide thetarget output (e.g. produced protein) of a new genetic circuit underinvestigation, by the output of a constitutive control genetic circuit,whose activity depends on the metabolic state of the cell in anequivalent way to the target (Yordanov et al., ACS Synthetic Biology,3(8):578-588, 2014; Rudge et al., ACS Synthetic Biology, 5(1):89-98,2016). In particular, Yordanov et al., describe a computational methodfor automated characterisation of genetic components (ACS SyntheticBiology, 3(8) 3(8):578-588, 2014) which uses two fluorescent reportersignals, one to monitor the transcriptional activity of a target geneticcircuit and another to monitor the activity of a control genetic circuitin which gene expression is under the control of a constitutivepromoter. The authors directly combine the fluorescence signals into asingle time-series, which is then mapped to a single quantity byaveraging within a time window of interest. For example, while the cellsare in exponential growth or alternatively in stationary phase.

However, these ratiometric methods rely on measurements during timewindows in which activity of the genetic circuit is assumed to beconstant. Therefore, these methods fail to capture dynamic behaviours.

Del Vecchio and colleagues created a library of genetic activationcascades in E. coli bacteria, in which they explicitly tune the resourcedemand of each gene, and show that resource competition shapes theresponse of genetic circuits (Qian et al., ACS Synth Biol. (2017); 6(7):1263-1272).

Therefore, there remains a need for a method for characterising theintrinsic properties of genetic components of genetic circuits.

SUMMARY

According to one aspect the present invention provides a method fordetermining one or more intrinsic properties of a DNA component, from aplurality of measurements obtained over a time period from a cellculture, with each cell comprising the DNA component, wherein the DNAcomponent is involved in transcription of one or more target signals,wherein the plurality of measurements comprises measurements relating tothe density of the cell culture over the time period and measurementsrelating to the amount of the one or more target signals in the cellculture over the time period, the method comprising:

-   -   (a) estimating parameter values for a first mathematical model        that describes cell culture density over time, by minimizing a        measure of the difference between the first mathematical model        output and the measurements relating to the density of the cell        culture over the time period;    -   (b) estimating parameters quantifying the intrinsic properties        of the DNA component, by embedding these parameters within a        further mathematical model that describes the production of the        one or more target signals over time, and by minimizing a        measure of the difference between the model outputs and the        measurements relating to the amount of the one or more target        signals, wherein the further mathematical model additionally        uses the parameter values estimated in (a) or parameter values        based thereon.

The present invention improves on existing methods by providing a methodthat allows a DNA component, which is involved in transcription of oneor more target signals in a synthetic genetic circuit, to becharacterised dynamically. As a result, genetic circuits containing thisDNA component can be designed and optimised in silico before having tobe prepared in vitro, leading to time and cost savings.

In a further aspect, the present invention provides a computer programproduct embodied on a computer readable storage and comprising codewhich is configured so as to perform the operations of theabove-described method when run on a computer system.

In a still further aspect, the present invention provides a method ofoptimizing expression of at least one gene comprised in a geneticcircuit, wherein the genetic circuit further comprises a DNA componentwhich is involved in transcription of the at least one gene, wherein themethod comprises: (1) determining one or more intrinsic properties ofthe DNA component using the method described above; (2) using the one ormore intrinsic properties of the DNA component determined in (1) tosimulate expression of the at least one gene from the genetic circuit inat least two different arrangements of the genetic circuit; (3)selecting the arrangement in (2) that results in optimal expression ofthe at least one gene; and (4) making the arrangement of the geneticcircuit selected in step (3).

BRIEF DESCRIPTION OF THE DRAWINGS

To assist understanding of the present disclosure and to show how it maybe put into effect, reference is made, by way of example only, to theaccompanying drawings in which:

FIGS. 1A, 1B and 1C show diagrams to visually depict an example of aratiometric method according to the invention. FIG. 1A. Feedback loopsbetween inserted genetic components and the growth and resources of thehost cell are replaced with an open loop scheme that permits parameteridentification. FIG. 1B. The parameter identification for each syntheticcircuit comprises three inference problems, that can proceedsequentially or in parallel. Furthermore, characterisation of multiplecircuits can be composed into a dependency graph, where circuits withcommon components can reuse parameters characterised previously. FIG.1C. The computational graph of panel FIG. 1B can be condensed.

FIGS. 2A and 2B show relay circuit network diagrams. FIG. 2A. TheP_(Las)-LuxI relay device, with double reporter. FIG. 2B. TheP_(Lux)-LasI device, with double reporter.

FIG. 3 shows circuit network diagram for the AHL lactonase module.

FIGS. 4A to D show a graphical depiction of dynamic characterisationmethods of the invention. FIG. 4A. Feedback loops between insertedgenetic components and the growth and resources of the host cell arereplaced with an open loop scheme that permits parameter identification.FIG. 4B. Flow diagrams illustrate connections between observation data(grey boxes) and parameter sets (circle nodes) for multiple circuits.FIG. 4C. Flow diagrams illustrate connections between observation data(grey boxes) and parameter sets (circle nodes) for multiple circuits.FIG. 4D. The computational graph of FIGS. 4B and 4C can be condensed.

FIG. 5A to F shows the modular design and characterisation of HSLsignalling components. The synthetic gene circuit and correspondingmodule calls within chemical reaction network (CRN) program for eachcircuit are shown. FIG. 5A. Autofluorescence characterised with cellsexpressing no CFP or YFP, but measured at 480 nm and 530 nm in responseto varying concentrations of EtOH. FIG. 5B. Fluorescent proteindegradation characterised using CFP and YFP with a constitutivepromoter. FIG. 5C. HSL double receiver characterised by measuring 4variant circuits with different ribosome binding sites for LuxR and LasRexpression and treated with C6-HSL or C12-HSL. FIG. 5D. HSL senders werecharacterised inducibly expressing LasI in response to differentconcentrations of C6-HSL via PLux76. FIG. 5E. The PBad promoter wascharacterised by inducing expression of YFP with differentconcentrations of arabinose. FIG. 5F. The HSL degrader, AiiA, wascharacterised by inducing its expression via PBad, then measuring theresponse of the double receiver to different concentrations of HSL.

FIG. 5G to Q provide graphs showing example measurements (solid lines(or dots in FIG. 5H) and calibrated model simulations (dashed lines) forthe synthetic circuits represented in FIGS. 5A to 5F. FIG. 5G. CFPfluorescent measurements at varying nM concentrations of EtOH (inten-fold dilutions starting from 1.7×10⁹ nM) from autofluorescencecharacterization as per FIG. 5A. Upper bracket on right hand siderepresents grouping of lines of example measurements and modelsimulations from 1.7×10⁷ to 1.7×10¹, 1.7×10⁻¹ and 0 nM EtOH. The lowertwo solid lines ending within this bracket are example measurements for1.7×10⁸ nM. Lower bracket: upper two dashed lines ending in this bracketrepresent model simulations for 1.7×10⁸ nM. Lower bracket: lower twosolid lines overlapping with dashed lines represent example measurementsoverlapping with model simulations for 1.7×10⁹ nM. FIG. 5H. YFPfluorescent measurements at varying nM concentrations of EtOH (inten-fold dilutions starting from 1.7×10⁹ nM) from autofluorescencecharacterization as per FIG. 5A. Upper bracket on right hand siderepresents grouping of example measurements and model simulations frommost of the EtOH concentrations. Lower bracket: upper pair of dashedlines represent model simulations for 1.7×10⁸ nM. Lower bracket, lowerdashed lines ending in this bracket represent model simulations for1.7×10⁹ nM, which track dots representing example measurements for1.7×10⁹ nM EtOH. FIG. 5I. CFP fluorescent measurements at varying μg/mlconcentrations of chloramphenicol (5 μg/ml; 2.5 μg/ml; 1.25 μg/ml; 0.625μg/ml; 0.1325 μg/ml; and 0 μg/ml) from standard promotercharacterisation as per FIG. 5B. Lines ending in upper bracket on righthand are those representing example measurements and model simulationsfrom 0 μg/ml, 0.1325 μg/ml, and 0.625 μg/ml. Lines ending in the lowerbracket on the right hand side are those representing examplemeasurements and model simulations from 1.25 μg/ml. Arrow points tosolid and dashed lines from 5 μg/ml and 2.5 μg/ml which track along thebottom of the graph. FIG. 5J. YFP fluorescent measurements at varyingμg/ml concentrations of chloramphenicol (5 μg/ml; 2.5 μg/ml; 1.25 μg/ml;0.625 μg/ml; 0.1325 μg/ml; and 0 μg/ml) from standard promotercharacterisation as per FIG. 5B. Top arrow points to examplemeasurements from 1.25 μg/ml chloramphenicol. Lines ending in upperbracket are those representing example measurements and modelsimulations from 0 μg/ml, 0.1325 μg/ml, and 0.625 μg/ml. Dashed linesending in lower bracket are those from model simulations from 1.25μg/ml. Bottom arrow points to solid and dashed lines from 5 μg/ml and2.5 μg/ml which track along the bottom of the graph. FIG. 5K. CFPfluorescent measurements at varying nM concentrations of C6 (25000 nM;8333.3 nM; 2777.8 nM; 925.9 nM; 308.6 nM; 102.9 nM; 34.2 nM; 11.4 nM;3.8 nM; 1.3 nM; 0.4 nM; and 0 nM) from HSL double receivercharacterisation as per FIG. 5C. Top arrow points to solid linesgenerated by example measurements with 25000 nM (darker line) and 8333.3nM (lighter line). Dashed lines ending in upper bracket represent linesgenerated by model simulations with 25000 nM (darker line) and 8333.3 nM(lighter line). Solid line ending in upper middle bracket, and dashedline ending directly beneath it relate to 925.9 nM C6. Solid line endingin middle bracket, and dashed line ending directly beneath it relate to308.6 nM C6. Solid line ending in lower middle bracket, and dashed lineending directly beneath it relate to 102.9 nM C6. Lines ending in bottombracket represent the concentrations beneath 102.9 nM C6. FIG. 5L. YFPfluorescent measurements at varying nM concentrations of C12 (25000 nM;8333.3 nM; 2777.8 nM; 925.9 nM; 308.6 nM; 102.9 nM; 34.2 nM; 11.4 nM;3.8 nM; 1.3 nM; 0.4 nM; and 0 nM) from HSL double receivercharacterisation as per FIG. 5C. Lines ending in top bracket relate to25000 nM, 8333.3 nM, 2777.8 nM, and 925.9 nM C12. Top arrow points tosolid and dashed line from 308.6 nM C12. Upper middle arrow points tosolid and dashed lines from 102.9 nM C12. Middle arrow points to solidand dashed lines from 34.3 nM C12. Lower middle arrow points to solidand dashed lines from 11.4 nM C12. Bottom arrow points to solid anddashed lines generated with all other concentrations. FIG. 5M. CFPfluorescent measurements at varying nM concentrations of C6 (25000 nM;8333.3 nM; 2777.8 nM; 925.9 nM; 308.6 nM; 102.9 nM; 34.2 nM; 11.4 nM;3.8 nM; 1.3 nM; 0.4 nM; and 0 nM) from HSL sender characterisation asper FIG. 5D. Lines ending in upper bracket are solid and dashed linesgenerated by 25000 nM and 8333.3 nM C6. Lines ending in upper middlebracket relate to 2777.8 nM and 925.9 nM C6. Lines ending in middlebracket relate to 308.6 nM C6. Lines ending in lower middle bracketrelate to 102.9 nM C6. Solid and dashed lines ending in lower bracketare generated with the lower concentrations of C6. FIG. 5N. YFPfluorescent measurements at varying nM concentrations of C12 (25000 nM;8333.3 nM; 2777.8 nM; 925.9 nM; 308.6 nM; 102.9 nM; 34.2 nM; 11.4 nM;3.8 nM; 1.3 nM; 0.4 nM; and 0 nM) from HSL sender characterisation asper FIG. 5D. Dashed lines ending in upper bracket relate to 25000 nM,8333.3 nM, 2777.8 nM, 925.9 nM, and 305.6 nM C12. Solid lines ending inmiddle bracket relate to 25000 nM, 8333.3 nM, 2777.8 nM, 925.9 nM, 305.6nM, and 102.9 nM C12. The dashed line ending within this bracket relatesto 102.9 nM C12. Dashed and solid lines at top arrow relate to 34.3 nMC12. Dashed and solid lines at bottom arrow relate to 11.4 nM C12. Linesgenerated by lower concentrations end within lower bracket. FIG. 5O. YFPfluorescent measurements at varying mM concentrations of arabinose (25mM; 12.5 mM; 6.25 mM and 3.13 mM) from arabinose-inducible promotercharacterisation as per FIG. 5E. Top arrow points to solid and dashedlines generated by 25 mM. Upper middle arrow points to solid and dashedlines generated by 12.5 mM. Lower middle arrow points to solid anddashed lines generated by 6.25 mM, and bottom arrow points to solid anddashed lines generated by 3.13 mM. FIG. 5P. CFP fluorescent measurementsat varying mM concentrations of arabinose (5 mM; 2.5 mM; 1.25 mM, 0.625mM, 0.3125 mM and 0 mM) from HSL degrader characterisation as per FIG.5F. Top arrow points to solid line generated by 0.625 mM. Arrow secondfrom top points to dashed line generated by 0.625 mM. Arrow third fromtop points to solid line generated by 0 mM. Arrow fourth from top pointsto dashed line generated by 0 mM. Arrow second from bottom points todashed line generated by 2.5 mM. Bottom arrow points to solid linegenerated by 2.5 mM. FIG. 5Q. YFP fluorescent measurements at varying mMconcentrations of arabinose (5 mM; 2.5 mM; 1.25 mM, 0.625 mM, 0.3125 mMand 0 mM) from HSL degrader characterisation as per FIG. 5F. Top arrowpoints to solid line generated by 1.25 mM. Arrow second from top pointsto dashed line generated by 1.25 mM. Arrow third from top points todashed line generated by 0 mM. Arrow fourth from top points to solidline generated by 0 mM. Arrow second from bottom points to solid anddashed lines generated by 0.3125 mM. Bottom arrow points to solid anddashed lines generated by 2.5 mM.

FIG. 6A bar chart showing maximum log-likelihood score of inferenceproblem of autofluorescent as per FIG. 5A for the 5 hypotheses about thedynamics of gene expression capacity shown in Table 1. (Right hand bars:Ratiometric characterisation method; Left hand bars: Directcharacterisation method.)

FIG. 6B bar chart showing maximum log-likelihood score of inferenceproblem of constitutive promoter gene circuit shown in FIG. 10 for the 5hypotheses about the dynamics of gene expression capacity shown inTable 1. (Right hand bars: Ratiometric characterisation method; Lefthand bars: Direct characterisation method.)

FIG. 6C bar chart showing maximum log-likelihood score of inferenceproblem of AHL double receiver gene circuit shown in FIG. 11 , for the 5hypotheses about the dynamics of gene expression capacity shown inTable 1. (Right hand bars: Ratiometric characterisation method; Lefthand bars: Direct characterisation method.)

FIG. 6D bar chart showing maximum log-likelihood score of inferenceproblem of AHL senders (relays) gene circuits shown in FIGS. 12A and12B, for the 5 hypotheses about the dynamics of gene expression capacityshown in Table 1. (Right hand bars: Ratiometric characterisation method;Left hand bars: Direct characterisation method.)

FIG. 6E bar chart showing maximum log-likelihood score of inferenceproblem of PBad gene circuit shown in FIG. 13 , for the 5 hypothesesabout the dynamics of gene expression capacity shown in Table 1. (Righthand bars: Ratiometric characterisation method; Left hand bars: Directcharacterisation method.)

FIG. 6F bar chart showing maximum log-likelihood score of inferenceproblem of AHL laconase (AiiA) gene circuit shown in FIG. 14 , for the 5hypotheses about the dynamics of gene expression capacity shown inTable 1. (Right hand bars: Ratiometric characterisation method; Lefthand bars: Direct characterisation method.)

FIG. 6G bar chart showing the sum of the maximum log-likelihood score ofthe six inference problems represented in FIGS. 6A to 6F, for the 5hypotheses about the dynamics of gene expression capacity shown inTable 1. (Right hand bars: Ratiometric characterisation method; Lefthand bars: Direct characterisation method.)

FIG. 6H bar chart showing maximum log-likelihood score of thesimultaneous method applied to parametize the six circuits of FIGS. 6Ato 6F, for the 5 hypotheses about the dynamics of gene expressioncapacity shown in Table 1. (Right hand bars: Ratiometriccharacterisation method; Left hand bars: Direct characterisationmethod.)

FIG. 7A graph showing the posterior predictive distribution quantifiedfor direct dynamic characterisation using the graph-based method for the5 hypotheses about the dynamics of gene expression capacity shown inTable 1. The box-plots indicate the interquartile range of samples fromthe marginalized parameter posteriors. The plus symbols indicate themeans of each distribution.

FIG. 7B graph showing the posterior predictive distribution quantifiedfor ratiometric dynamic characterisation using the graph-based methodfor the 5 hypotheses about the dynamics of gene expression capacityshown in Table 1. The box-plots indicate the interquartile range ofsamples from the marginalized parameter posteriors. The plus symbolsindicate the means of each distribution.

FIG. 7C graph showing the posterior predictive distribution quantifiedfor direct dynamic characterisation using the simultaneous method forthe 5 hypotheses about the dynamics of gene expression capacity shown inTable 1. The box-plots indicate the interquartile range of samples fromthe marginalized parameter posteriors. The plus symbols indicate themeans of each distribution.

FIG. 7D graph showing the posterior predictive distribution quantifiedfor the ratiometric dynamic characterisation using the simultaneousmethod for the 5 hypotheses about the dynamics of gene expressioncapacity shown in Table 1. The box-plots indicate the interquartilerange of samples from the marginalized parameter posteriors. The plussymbols indicate the means of each distribution.

FIGS. 7E to J. Graphs showing a comparison of the posterior predictivedistribution for the TargetSwitch model using the direct graph method.HSL treatment indicated above graph. Time of HSL addition: 0 h, 2.5 hand 4.82 h. YFP and CFP data are shown as thick lines, with modelsimulations depicted as the mean as thin lines, and 95% credibilityintervals as the shaded regions marked at right hand edge as region a(for CFP) and region b (for YFP). Thick line and thin line within regiona are CFP data and model lines, respectively. FIG. 7E. HSL treatment:C₆=25000 nM. Thick line and thin line within region b are YFP data andmodel lines—overlapping and not distinguishable from each other on thisgraph. FIG. 7F. HSL treatment: C₆=5000 nM. Thick line and thin linewithin region b are YFP data and model lines—overlapping and notdistinguishable from each other on this graph. FIG. 7G. HSL treatment:C₆=1000 nM. Thick line and thin line within region b are YFP data andmodel lines—overlapping and not distinguishable from each other on thisgraph. FIG. 7H. HSL treatment: C₆=200 nM. Thick line and thin linewithin region b are YFP data and model lines—overlapping and notdistinguishable from each other on this graph. FIG. 7I. HSL treatment:C₆=40 nM. Thick line and thin line within region b are YFP data andmodel lines—overlapping and not distinguishable from each other on thisgraph. FIG. 7J. HSL treatment: C₆=8 nM. Arrow a indicates CFP thick andthin lines—overlapping and not distinguishable from each other on thisgraph. Arrow b indicates YFP thick and thin lines—overlapping and notdistinguishable from each other on this graph.

FIGS. 7K to P Graphs showing a comparison of the posterior predictivedistribution for the TargetSwitch model using the direct graph method.HSL treatment indicated above graph. Time of HSL addition: 0 h, 2.5 hand 4.82 h. YFP and CFP data are shown as thick lines, with modelsimulations depicted as the mean as thin lines, and 95% credibilityintervals as the shaded regions marked at right hand edge as region a(for CFP) and region b (for YFP). Thick line and thin line within regionb are YFP data and model lines, respectively. FIG. 7K. HSL treatment:C₁₂=25000 nM. FIG. 7L. HSL treatment: C12=5000 nM. Thick line and thinline within region a are CFP data and model lines—overlapping and notdistinguishable from each other on this graph. FIG. 7M. HSL treatment:C₁₂=1000 nM. Thick line and thin line within region a are CFP data andmodel lines—overlapping and not distinguishable from each other on thisgraph. FIG. 7N. HSL treatment: C₁₂=200 nM. Thick line and thin linewithin region a are CFP data and model lines—overlapping and notdistinguishable from each other on this graph. FIG. 7O. HSL treatment:C₁₂=40 nM. Thick line and thin line within region a are CFP data andmodel lines—overlapping and not distinguishable from each other on thisgraph. FIG. 7P. HSL treatment: C₁₂=8 nM. Arrow a indicates YFP solidline and arrow b indicates CFP solid line.

FIG. 8 Dependency graph for circuit parameters (white circles) andcollections of circuit measurements (grey boxes enclosed in roundedrectangles). Each inference task is indicated by a solid arrow,representing the approximation of P (0|Data, H) where H is the modellinghypothesis being used.

FIG. 9 shows circuit design of chromosomal RFP.

FIG. 10 shows design of constitutive expression circuit.

FIG. 11 shows the design of an AHL double receiver (DR) gene circuit.

FIGS. 12A and 12B show relay circuit network diagrams. FIG. 12A. ThePo_(Las)-LuxI relay device, with double reporter. FIG. 12B. ThePo_(Lux)-LasI device, with double reporter.

FIG. 13 shows the design of PBAD gene circuit.

FIG. 14 shows the circuit network diagram for AHL lactonase module.

DETAILED DESCRIPTION

As discussed above, the present invention provides a method fordetermining one or more intrinsic properties of a DNA component, from aplurality of measurements obtained over a time period from a cellculture, with each cell comprising the DNA component, wherein the DNAcomponent is involved in transcription of one or more target signals,wherein the plurality of measurements comprises measurements relating tothe density of the cell culture over the time period and measurementsrelating to the amount of the one or more target signals in the cellculture over the time period, the method comprising:

-   -   (a) estimating parameter values for a first mathematical model        that describes cell culture density over time, by minimizing a        measure of the difference between the first mathematical model        output and the measurements relating to the density of the cell        culture over the time period;    -   (b) estimating parameters quantifying the intrinsic properties        of the DNA component, by embedding these parameters within a        further mathematical model that describes the production of the        one or more target signals over time, and by minimizing a        measure of the difference between the model outputs and the        measurements relating to the amount of the one or more target        signals, wherein the further mathematical model additionally        uses the parameter values estimated in (a) or parameter values        based thereon.

The method can be used to characterise one or more intrinsic propertiesof a DNA component which is involved in transcription. In an expressionsystem, such as in the situation where the DNA component is a promoterdriving expression of a gene in a bacterial cell culture, the level ofgene transcription will depend not only on the nature of the DNAcomponent itself, but also on external (or extrinsic) factors such asavailability of the cell's transcription machinery or even DNA sequenceslocal to the DNA component. In the context of the present inventionintrinsic properties of the DNA component are those that arise from theDNA component itself. In particular, intrinsic properties are those thatresult from the sequence of the DNA component, any modifications of theDNA component (such as methylation), and any 3D structure of the DNAcomponent. The DNA component may be a regulatory sequence which iscapable of regulating transcription of a gene. The DNA component can bea promoter or an enhancer. An intrinsic property of such a component maybe the ability of the component to recruit polymerase, to bindtranscription factors, or to regulate the transcriptional product of agene.

In an embodiment, the method further comprises obtaining themeasurements referred to above by culturing cells comprising the DNAcomponent and taking the measurements over the time period.

The method comprises the use of a plurality of measurements obtainedover a time period from a cell culture (or cell colony), wherein eachcell comprises the DNA component. The DNA component is involved intranscription of one or more target signals. In particular, the DNAcomponent is part of a genetic circuit from which the one or more targetsignals are transcribed. This genetic circuits can be integrated in thechromosomes of the cells of the culture. Alternatively, this geneticcircuit can be on separate genetic elements such as plasmids,bacteriophage, cosmids, bacterial artificial chromosomes, yeastartificial chromosomes or mammalian artificial chromosomes. Inparticular, the genetic circuit can be on a plasmid. The copy number ofthe genetic circuit comprising the DNA component may be known orunknown.

The one or more target signals may be RNA or protein signals. Inparticular, it is desirable for the level of the one or more targetsignal to be measured repeatedly over the time period withoutsignificantly interfering with or disrupting the growth of the cellculture. Where the one or more target signals are RNA they may bemeasured by e.g. RNAseq technology. Where the one or more target signalsare proteins they can be measured in other ways. For example, the targetsignal may be one or more luminescent or fluorescent proteins.Preferably the one or more target signals are different fluorescentproteins (e.g. green, yellow, red, or cyan fluorescent proteins—GFP,YFP, RFP and CFP, respectively) and the measurements relating to theamount of the one or more target signals in the cell culture over thetime period in the method are measurements of the amount of fluorescenceat a particular wavelength corresponding to the type of fluorescentprotein being used. In an example, the one or more target signals areselected from yellow fluorescent protein (YFP), cyan fluorescent protein(CFP) and red fluorescent protein (RFP).

In the detailed description below the method is formulated withreference to specific target signals, i.e. RFP, CFP and YFP. However, itwill be appreciated that other target signals can be used in these moredetailed methods and equations.

In a preferred embodiment of the invention a non-linear observer processis used to compare the mathematical models with the measurementsobtained from each cell culture. In particular, these measurements areusually cell-aggregated measurements that relate to the whole cellculture. For example, where measurements relate to the amount offluorescence at a particular wavelength from a particular well of amicrotitre plate, this is the fluorescence of the whole colony/culturewithin that well and therefore correspond to a cell-aggregated measure.However, the models describing the production of target signals (orother signals referred to herein) preferably describe intracellularinteractions and therefore intracellular concentrations. Therefore, tocompare the models with the bulk fluorescence data, a non-linearobserver process is used.

For example, where the signals are CFP and YFP, for each wavelength w,bulk fluorescent variables can be described as

B ₄₈₀=([CFP])+[F ₄₈₀])×c+B ^(back) ₄₈₀

B ₅₃₀=([YFP])+[F ₅₃₀])×c+B ^(back) ₅₃₀

where [Fw] is the per-cell autofluorescence at wavelength w, c is celldensity, and B^(back) _(W) is the constant background fluorescence atwavelength w.

The cell culture (or colony) can be a culture of eukaryotic orprokaryotic cells. In particular, the cells can be bacterial, mammalianor yeast cells. In one example the cells are E. coli cells or CHO cells.

The measurements relating to the density of the cell culture over thetime period can be any measurement which reflects cell culture density.Suitable ways of measuring cell culture density are known in the art.However, it is most convenient to measure the optical densities of thecell culture over the time period, in particular, measurements at awavelength of 600 nm, i.e. OD₆₀₀.

The length of the time period is not particularly limited but should belong enough to cover the cell culture growth phases of interest, e.g.one or more of the lag phase, the exponential phase, the stationaryphase and the death phase. Preferably the time period covers at leastthe exponential phase. During the time period measurements are taken atregular time intervals, which will depend on the speed of cell culturegrowth, usually every 5 to 20 minutes, or every 10 to 15 minutes.Preferably the measurements are taken as frequently as possible. Inreality this depends on what the machine set up allows.

The method may be performed with a single cell culture. However,preferably it is performed with measurements from a plurality ofseparate cell cultures, each being subjected to different cultureconditions, i.e. different external conditions. These conditions may berelevant to transcription (and optionally also translation) of the oneor more target signals, e.g. the presence or absence of differentnutrients or different concentrations of nutrients, etc. As describedfurther below, variations in cell behaviour between the plurality ofseparate cell cultures (e.g. wells of a microtitre plate) that have beensubjected to different conditions, can be quantified using themeasurements and the model described herein. These variations can beused to factor out assumedly equivalent variations in the functioning ofthe gene circuit comprising the DNA component and the one or more targetsignals, so as to establish the intrinsic properties of the DNAcomponent.

In particular, the plurality of cultures should be grown in separatecompartments. For example, the plurality of cell cultures may be grownin different wells of a microtitre plate. In this regard, thedescription below may refer to “well-specific” properties. However, itwill be appreciated that other containers or compartments may be used togrow the separate cell cultures.

The measurements obtained from the cell culture or plurality of cellcultures are used in order to quantify the one or more intrinsicproperties of the DNA component. In particular, these are: measurementsrelating to the density of the cell culture over the time period andmeasurements relating to the amount of one or more target signals in thecell culture over the time period. In one aspect of this method of theinvention, which is described further below, the measurements alsoinclude measurements relating to the amount of a reference signal in thecell culture over the time period.

The parts of the method can be performed separately, simultaneously orsequentially. In particular, it is not necessary for part (a) to havebeen completed for all measurements before (b) is started.

Similarly, the method refers to a first mathematical model and a furthermathematical model (which, as described below, can be designated as thethird mathematical model). However, the designations “first”, and“further” are arbitrary designations for ease of reference and are notintended to imply any order of use. Moreover, the first and furthermathematical models may be separate models or parts of a single model.

In one example of the method of this aspect of the invention, themeasurements may be ones obtained from a cell culture in which each cellfurther comprises a reference promoter that initiates transcription of areference signal; wherein the plurality of measurements of the methodfurther comprises measurements relating to the amount of the referencesignal in the cell culture over the time period; wherein the methodfurther comprises estimating parameter values for a second mathematicalmodel that describes the capacity of the cell culture to produce thereference signal over time, by minimizing a measure of the differencebetween the second mathematical model output and the measurementsrelating to the amount of the reference signal in the cell culture overthe time period, wherein the second mathematical model additionally usesthe parameter values estimated in (a); and the further mathematicalmodel being a third mathematical model, wherein said additional use ofparameter values comprises using the parameter values estimated based on(a) by the second mathematical model, taking the second mathematicalmodel to provide an estimate of the capacity of the cell culture toproduce the one or more target signals over time.

In particular, in this example the method may be regarded as a methodfor determining one or more intrinsic properties of a DNA component,from a plurality of measurements obtained over a time period from a cellculture, with each cell comprising the DNA component and a referencepromoter, wherein the DNA component is involved in transcription of oneor more target signals and the reference promoter initiatestranscription of a reference signal, wherein the plurality ofmeasurements comprises: (i) measurements relating to the density of thecell culture over the time period; (ii) measurements relating to theamount of the reference signal in the cell culture over the time period;and (iii) measurements relating to the amount of the one or more targetsignals in the cell culture over the time period, the method comprising:

-   -   (1) estimating parameter values for a first mathematical model        that describes cell culture density over time, by minimizing a        measure of the difference between the first mathematical model        output and the measurements relating to the density of the cell        culture over the time period;    -   (2) estimating parameter values for a second mathematical model        that describes the capacity of the cell culture to produce the        reference signal over time, by minimizing a measure of the        difference between the second mathematical model output and the        measurements relating to the amount of the reference signal in        the cell culture over the time period, wherein the second        mathematical model additionally uses the parameter values        estimated in (1);    -   (3) estimating parameters quantifying the intrinsic properties        of the DNA component, by embedding these parameters within a        third mathematical model that describes the production of the        one or more target signals over time, and by minimizing a        measure of the difference between the model outputs and the        measurements relating to the amount of the one or more target        signals, wherein the third mathematical model additionally uses        the parameter values estimated in (2) by taking the second        mathematical model to provide an estimate of the capacity of the        cell culture to produce the one or more target signals over        time.

In an embodiment, the method further comprises obtaining themeasurements referred to above by culturing cells comprising the DNAcomponent and the reference promoter, and taking measurements (i), (ii)and (iii) over the time period.

The method comprises the use of a plurality of measurements obtainedover a time period from a cell culture (or cell colony), wherein eachcell comprises the DNA component. The DNA component is involved intranscription of one or more target signals and the reference promoterinitiates/regulates transcription of a reference signal. In particular,the DNA component is part of a genetic circuit from which the one ormore target signals are transcribed. The reference promoter is part of agenetic circuit from which the reference signal is transcribed. Thesegenetic circuits can be integrated in the chromosomes of the cells ofthe culture. Alternatively, these genetic circuits can be on separategenetic elements such as plasmids, bacteriophage, cosmids, bacterialartificial chromosomes, yeast artificial chromosomes or mammalianartificial chromosomes. In particular, the genetic circuits can be onone or more plasmids. In one example the genetic circuit comprising thereference promoter (“the control circuit” as shown in FIG. 1A) isintegrated in a chromosome, whereas the genetic circuit comprising theDNA component (the “synthetic circuit” as shown in FIG. 1A) is on aplasmid. Most preferably the genetic circuit of the reference promoteris chromosomally integrated or is on the same plasmid as the geneticcircuit of the DNA component.

The copy number of the genetic circuit comprising the DNA component andthe copy number of the genetic circuit comprising the reference promotermay be known. The genetic circuit comprising the DNA component and thegenetic circuit comprising the reference promoter may be on the sameextra chromosomal element, e.g. the same plasmid.

The one or more target signals and the reference signal may be RNA orprotein signals. It is desirable for the level of the one or more targetsignal and the reference signal to be measured repeatedly over the timeperiod without significantly interfering with or disrupting the growthof the cell culture. Where the one or more target signals and thereference signal are RNA they may be measured by e.g. RNAseq technology.Where the one or more target signals and reference signals are proteinsthey can be measured in other ways. For example, the target signal andthe reporter signal may be one or more luminescent or fluorescentproteins. Preferably the one or more target signals and the reportersignal are different fluorescent proteins (e.g. green, yellow, red, orcyan fluorescent proteins—GFP, YFP, RFP and CFP, respectively) and themeasurements of (ii) and (iii) in the method are measurements of theamount of fluorescence at a particular wavelength corresponding to thetype of fluorescent protein being used. In a preferred example, thetarget signal is yellow fluorescent protein (YFP) and the referencesignal is red fluorescent protein (RFP).

In particular, the reference promoter and reference signal are chosen tocapture the same external factors that influence transcription (andtranslation if the signals are protein signals) of the one or moretarget signals, i.e. such that the activity of the genetic circuitcomprising the reference promoter and reference signal depends on themetabolic state of the cell in an equivalent way to the genetic circuitcomprising the DNA component and the one or more target signals. Thereference promoter may be a constitutive promoter.

The cell culture (or colony) can be a culture of eukaryotic orprokaryotic cells. In particular, the cells can be bacterial, mammalianor yeast cells. Preferably the cells are E. coli cells or CHO cells.

The measurements relating to the density of the cell culture over thetime period can be any measurement which reflects cell culture density.Suitable ways of measuring cell culture density are known in the art.However, it is most convenient to measure the optical densities of thecell culture over the time period, in particular, measurements at awavelength of 600 nm, i.e. OD₆₀₀.

The length of the time period is not particularly limited but should belong enough to cover the cell culture growth phases of interest, e.g.one or more of the lag phase, the exponential phase, the stationaryphase and the death phase. Preferably the time period covers at leastthe exponential phase. During the time period measurements are taken atregular time intervals, which will depend on the speed of cell culturegrowth, usually every 5 to 20 minutes, and preferably around every 10 to15 minutes.

The method may be performed with a single cell culture. However,preferably it is performed with measurements from a plurality ofseparate cell cultures, each being subjected to different cultureconditions, i.e. different external conditions. These conditions may berelevant to transcription (and optionally also translation) of the oneor more target signals and the reference signal, e.g. the presence orabsence of different nutrients or different concentrations of nutrients,etc. As described further below, variations in cell behaviour betweenthe plurality of separate cell cultures (e.g. wells of a microtitreplate) that have been subjected to different conditions, can bequantified using the measurements and the model (that describes thecapacity of the cell culture to produce of the reference signal). Thesevariations can be used to factor out assumedly equivalent variations inthe functioning of the gene circuit comprising the DNA component and theone or more target signals, so as to establish the intrinsic propertiesof the DNA component.

In particular, the plurality of cultures should be grown in separatecompartments. For example, the plurality of cell cultures may be grownin different wells of a microtitre plate. In this regard, thedescription below may refer to “well-specific” properties. However, itwill be appreciated that other containers or compartments may be used togrow the separate cell cultures.

The measurements (i), (ii) and (iii) obtained from the cell culture orplurality of cell cultures are used in order to quantify the one or moreintrinsic properties of the DNA component. The parts of the method usingthese measurements are indicated as (1), (2) and (3). These parts can beperformed separately, simultaneously or sequentially. In particular, itis not necessary for (1) and (2) to have been completed for allmeasurements before (3) is started.

Similarly, (1), (2) and (3) of the method refer to a first mathematicalmodel, a second mathematical model and a third mathematical modelrespectively. However, the designations “first”, “second” and “third”are arbitrary designations for ease of reference and are not intended toimply any order of use. Moreover, the first, second and thirdmathematical models may be separate models or parts of a single model.

FIGS. 1A to C and 4A to D provide graphical depictions of thecharacterisation methods of the present invention.

FIGS. 1A to C show a graphical depiction of ratiometric dynamiccharacterisation method as an example of an aspect of the invention. Themethod seeks to infer the quantitative properties of a synthetic circuitwhile accounting for the feedback of the synthetic circuit consumingcellular resources and inhibiting growth. To approximate this closedloop system, an open loop approximation is proposed, which is shown inFIG. 1A, that allows each cell culture i (expressing different circuitsand/or experiencing different treatments) to have a model of cell growth(γ) and gene expression capacity (h) that are parameterized bycircuit/condition-specific parameters (ρ_i, ξ_(i)). The model of thesynthetic circuit f therefore embeds these circuit-specific factors. Aflow diagram shown in FIG. 1B illustrates connections betweenobservation data (grey boxes) and parameter sets (circle nodes) formultiple circuits. For each condition i, cell growth parameters ρ_(i)are inferred against OD₆₀₀ measurements. Next, gene expression capacityparameters are inferred against bulk fluorescence measurements ofchromosomally integrated constitutive expression of mRFP1 (F₅₅₀),simultaneously for all conditions, to enable both circuit-specific(ξ_(i)) and circuit non-specific (ϕ) parameters to be inferred. Finally,the parameters of the synthetic circuit are inferred against bulkfluorescence measurements of spectrally distinct fluorescent proteins(here ECFP and EYFP). If parameters of a model have been inferredpreviously, their marginal posteriors are propagated to the inference ofthe current circuit. This is indicated by the dashed arrows. As shown inFIG. 1C the computation graph of panel B can be condensed, but it shouldbe noted that arbitrary acyclic graphs can be defined over the outlinedprocedure.

FIGS. 4A to D shows a further graphical depiction of the dynamiccharacterisation methods according to the invention. (A.) The methodseeks to infer the quantitative properties of a synthetic circuit whileaccounting for the feedback of the synthetic circuit consuming cellularresources and inhibiting growth. To approximate this closed loop system,an open loop approximation is proposed that allows each cell culture i(expressing different circuits and/or experiencing different treatments)to be described by models of cell growth (γ) and gene expressioncapacity (h) that are parameterized by circuit/condition-specificparameters (ρ_(i), ξ_(i)). The model of the circuit f therefore embedsthese circuit-specific factors. (B,C.) A flow diagram illustratesconnections between observation data (grey boxes) and parameter sets(circle nodes) for multiple circuits. For each condition i, cell growthparameters ρ_(i) are inferred against OD₆₀₀ measurements. In ratiometriccharacterisation (panel C), gene expression capacity parameters areinferred against bulk fluorescence measurements of chromosomallyintegrated constitutive expression of mRFP1 (F₅₅₀), simultaneously forall conditions, to enable both circuit-specific (ξ_(i)) and circuitnon-specific (ϕ) parameters to be inferred. Finally, the parameters ofthe synthetic circuit are inferred against bulk fluorescencemeasurements of spectrally distinct fluorescent proteins (here ECFP andEYFP), and incorporate parameters for cell growth and for ratiometriccharacterisation. If parameters of a model have been inferredpreviously, their marginal posteriors are propagated to the inference ofthe current circuit. This is indicated by the dashed arrows. D. Thecomputation graph of panels B and C can be condensed to indicate thedependency of multiple circuits more succinctly.

Cell Growth Characterisation/Cell Growth Phase

In (a) of the method the cell growth is characterised. In particular,parameter values are estimated for a first mathematical model thatdescribes cell culture density over time, by minimizing a measure of thedifference between the first mathematical model output and themeasurements relating to density of the cell culture over the timeperiod. Where the method is performed with measurements (e.g.measurements (i), (ii) and (iii)) obtained from a plurality of separatecell cultures, the growth of each cell culture is modelled separately(i.e. (a) is performed separately for each cell culture) to explicitlycapture acceleration/deceleration of growth in response to variations intranscription activity and cell culture size in each culture.

Suitable mathematical models for characterising cell growth are known inthe art. Preferably the model is the logistic cell growth model, morepreferably the lag-logistic growth model.

The method of (a) may use an optimisation algorithm to minimize ameasure of the difference between the first mathematical model outputand the measurements relating to density of the cell culture over thetime period. In particular, minimizing a measure of the difference in(a) may be minimizing a sum of squared errors or the absolutedifferences between the first mathematical model and the measurementsrelating to the density of the cell culture over the time period.Preferably the minimizing a measure of the difference in (a) uses aNelder-Mead simplex algorithm.

In the description below the first mathematical model is the laglogistic growth model and the measurements relating to the density ofthe cell culture are optical density OD₆₀₀ measurements. Still further,the measurements of (i) comprise measurements relating to the celldensity of a plurality of separate cell cultures over the time periodgrown in separate wells of a microtitre plate. These embodiments arepreferred, however, it will be appreciated that (a) can be performedwith other cell growth models, and with other measurements relating tothe density of the cell culture. Moreover, containers other thanmicrotitre plates can be used to grow the plurality of cell cultures.

The lag logistic growth model is described by:

$\frac{dc}{dt} = \left\{ \begin{matrix}{0,} & {t < t_{lag}} \\{{{rc}\left( {1 - \frac{c}{K}} \right)},} & {t \geq t_{lag}}\end{matrix} \right.$

where c is cell density, r is the per capita colony growth rate and K isthe carrying capacity, which represents the maximum colony size. Thetime-evolution of this equation depends also on the initial celldensity, c⁰:=c (t=0). In the measured OD₆₀₀ signal, there isnon-cellular background signal, which is parameterized as a(plate-level) parameter X_(b), which is shared by the separate cellcultures. Therefore, the time evolution of the OD signal of a singlecell colony (e.g. in well i) is modelled as:

${x_{i}(t)} = {\frac{K_{i}e^{r_{i}t}c_{0}}{K_{i} - c_{i}^{0} + {e^{r_{i}t}c_{i}^{0}}} + x_{b}}$

where r_(i), K_(i), C⁰ _(i) are the specific cell growth parameters ofthe cell colony in well i.

By assuming that the initial cell density is small, it can be taken thatX_(i) (0)≈X_(b). Therefore, the parameter X_(b) can be identified as theaverage over the first time point of the OD signal in each cell culture(e.g. in each well). This leaves the identification of the culturespecific (e.g. well-specific) parameters as a set of inference problemsin which we seek to minimize the sum of the square errors (SSE) betweenmodel and data for each well separately

$P_{i}^{OD} = {\min\limits_{\theta_{i}}\left\{ {{SSE}\left( {\theta_{i}❘{OD}} \right)} \right\}}$

where θi=(r_(i), K_(i), c⁰ _(i)). For simple optimisation problems overfew parameters, it is possible to obtain good performance (bothreliability and efficiency) using direct search methods such as theNelder-Mead simplex algorithm (Nelder et al., Computer Journal (1964);7: 308-313). The implementation in MathNet.Numerics(http://numerics.mathdotnet.com/) can be used for cell growthcharacterisation.

Alternatively, the cell growth can be characterised by the lag logisticmodel as follows:

If x_(k) is defined to be the measured signal corresponding to OD₆₀₀ attime-points t_(k) (k=1, . . . n_(t)), then this can be related to amodel of cell growth described by c (t) according to

x _(k) ˜{circumflex over (x)}(t _(k))=c(t _(k))+x _(b)

where x_(b) is the background fluorescence and c representsGaussian-distributed measurement noise with standard deviation σ.

To model cell density, the following general equation is used

$\frac{dc}{dt} = {{\gamma\left( {c;\rho} \right)}.c}$

where γ (c;ρ) is the specific growth rate function that is parameterizedby ρ. For the lag-logistic model, γ is defined by

${\gamma\left( {c,\rho} \right)} = \left\{ \begin{matrix}{0,} & {t < t_{lag}} \\{{r\left( {1 - \frac{c}{K}} \right)},} & {t \geq t_{lag}}\end{matrix} \right.$

Here r is the per capita colony growth rate, K is the carrying capacityrepresenting the maximum colony size and t_(lag) is the duration of thelag phase of bacterial growth. Accordingly, the parameters of cellgrowth for colony i are given by ρ_(i)={r_(i), K_(i), t_(lag,i)}. Thetime-evolution of c(t) depends also on the initial cell density, c⁰:=c(t0), which is fixed to the intended dilution factor (0.2%) duringpreparation of the assay. In the measured OD₆₀₀ signal, there is anon-cellular background signal, which is parameterized as a plate-levelshared parameter x_(b). As a closed-form solution exists for thelag-logistic model, the time evolution of the OD signal in well i cansimply be modelled as

${{\hat{x}}_{i}(t)} = {\frac{K_{i}e^{r_{i}({t - t_{lag}})}c^{0}}{K_{i} - c^{0} + {e^{r_{i}({t - t_{lag}})}c^{0}}} + x_{b}}$

By assuming that the initial cell density is small, it can be taken thatc_(i) (0)≈0, and so the background absorbance measurement will beapproximately equal to the initial absorbance measurement. Therefore,the parameter x_(b) is identified as the average over the first timepoint of the OD signal in each well. This leaves the identification ofthe well-specific parameters as a set of inference problems in which weseek to minimize the deviation between the data (x_(k)) and simulation({circumflex over (x)}(t_(k))).

To obtain parameter estimates, it is assumed that the deviation betweendata and simulation is Gaussian-distributed with standard deviation σ.We seek to maximise the log-likelihood of producing the data in well iwith parameters ρ_(i)

$\max\limits_{\rho_{i}}\left\{ {l\left( \rho_{i} \right)} \right\}$

where l(ρ_(i)) is the sum of the log-probabilities of Gaussianobservations for x_(k)−{circumflex over (x)}_(k)(t_(i)), with varianceσ². For simple optimisation problems over few parameters, it is possibleto obtain good performance (both reliability and efficiency) usingdirect search methods such as the Nelder-Mead simplex algorithm (Nelderand Mead, The Computer Journal, 7(4):308-313, 1965). The implementationin MathNet.Numerics (http://numerics.mathdotnet.com/) can be used.

Characterisation of the Ratiometric Reference (Control) Signal/ControlPhase

In a further part of the method parameter values are estimated for asecond mathematical model that describes the capacity of the cellculture to produce the reference signal over time, by minimizing ameasure of the difference between the second mathematical model outputand the measurements relating to the amount of the reference signal inthe at least one cell culture over the time period, wherein the secondmathematical model additionally uses the parameter values estimated in(a).

In particular, the parametized first mathematical model from (a) can beused in in this part of the method to determine a time-dependentdilution rate for the cell culture (or for each cell culture of theplurality of cell cultures) as part of the second mathematical model.

The second mathematical model describes the capacity of the cell cultureto produce the reference signal over time, which can then be taken orused as an estimate of the capacity of the cell culture to produce theone or more target signals over time, i.e. the second mathematical modelcan serve as a proxy for the ability of the cell culture to produceother gene expression products, such as the one or more target signals.

Preferably the second mathematical model models the reference signal asa chemical reaction network. The method of this part may use anoptimisation algorithm to minimize a measure of the difference betweenthe second mathematical model output and the measurements relating tothe amount of the reference signal over the time period. In particular,minimizing a measure of the difference in this part may use a Markovchain Monte Carlo (MCMC) method, preferably a Metropolis-Hastingsalgorithm.

Where the method is performed with measurements (e.g. measurements (i),(ii) and (iii)) obtained from a plurality of separate cell cultures(subjected to different conditions), the capacity of each culture toproduce the reference signal over time is modelled separately (i.e. thispart is performed separately) for each cell culture. In this wayvariations between the different cell cultures can be quantified.

In the description below the reference signal is red fluorescent protein(RFP) and the measurements relating to the amount of reference signal inthe cell culture are measurements of the level of RFP fluorescence.These embodiments are preferred; however, it will be appreciated thatthis part of the method can be performed with another reference signal,and with other related measurements of the level of the reporter signal.

In particular, in this part of the method the fluorescence signals fromthe RFP can be modelled directly as a chemical reaction network (CRN),and this parameterized model can be used to apply equivalent extrinsicvariations to a model of the target fluorescence signal.

The model may utilise an open loop approximation and may not explicitlydescribe feedback of the gene circuit consuming cellular resources andinhibiting growth. In this regard, the model may include a descriptionof DNA and RNA, or DNA, RNA and protein but exclude a specificdescription of the host cell machinery. For example, it may exclude aspecific description of polymerases and/or ribosomes.

Specifically, constitutive production of RFP mRNA, followed bytranslation and fluorescent protein maturation can be described by thefollowing CRN:

$\begin{matrix}{\varnothing\overset{k_{m}}{\rightarrow}{mRNA}} \\{{mRNA}\overset{k_{p}}{\rightarrow}{{mRNA} + {iRFP}}} \\{{iRFP}\overset{k_{mat}}{\longrightarrow}{RFP}} \\{{mRNA}\overset{d_{m} + \gamma}{\longrightarrow}\varnothing} \\{{iRFP}\overset{d_{RFP} + \gamma}{\longrightarrow}\varnothing} \\{{RFP}\overset{d_{RFP} + \gamma}{\longrightarrow}\varnothing}\end{matrix}$

where k_(m), and pc are transcription, translation and maturation of RFPrespectively, γ is the rate of dilution, and d_(m), d_(RFP) aredegradation rates of mRNA and (RFP) protein. Assuming mass action, thecorresponding ordinary differential equations for the time evolution ofthe concentrations of each molecule can be written as:

${\frac{d\lbrack{mRNA}\rbrack}{dt} = {k_{m} - {\left( {d_{m} + \gamma} \right)\lbrack{mRNA}\rbrack}}}{\frac{d\lbrack{iRFP}\rbrack}{dt} = {{k_{p}\lbrack{mRNA}\rbrack} - {\left( {d_{RFP} + \gamma + \mu_{R}} \right)\lbrack{iRFP}\rbrack}}}{\frac{d\lbrack{RFP}\rbrack}{dt} = {{\mu_{R}\lbrack{iRFP}\rbrack} - {\left( {d_{RFP} + \gamma} \right)\lbrack{RFP}\rbrack}}}$

Here, the effect of dilution is described as a quantity γ^((i)), where idenotes the particular colony being measured. This dilution termrepresents the decline in concentration as the volume of cellsincreases, and since the cell growth for each cell culture (e.g. eachwell) is characterised, the parameterized logistic model from step (a)can be used to determine an accurate time-dependent dilution rate foreach cell culture (e.g. each well in the microtiter plate). Inparticular, for well i, the specific growth rate may be assigned asγ^((i))=r_(i) (1−c/K_(i)) (for t≥t_(lag), and 0 otherwise).

The dynamics of transcription and mRNA turnover are usually faster thantranslation and protein turnover, and therefore a separation oftimescales argument can be applied to simplify the model equations (5).By assuming that the mRNA equation is in a quasi-equilibrium, equation5a can be approximated as:

$\lbrack{mRNA}\rbrack^{*} \approx \frac{k_{m}}{d_{m} + \gamma}$

Accordingly, the simplified model can be updated. In addition, if thematuration time is faster than the turnover time for the fluorescentprotein, then an even simpler model can be derived.

In order to describe measurements of a circuit exposed to differentconditions, the model sets in which some parameters are sample-specificcan be considered and inter-sample variations can be incorporated, whileothers are assumed to be fixed properties of the biochemistry.Accordingly, for a sample i, the simplified model of the reference(control) signal becomes

$\frac{{d\lbrack{RFP}\rbrack}^{(i)}}{dt} = {k_{c}^{(i)} - {\left( {d_{RFP} + \gamma^{(i)}} \right)\lbrack{RFP}\rbrack}^{(i)}}$

In this formulation, k_(c) ^((i)) represents the gene expressioncapacity of the cells in sample i.

A sample specific k_(c) enables the capture of differences intranscription, translation and mRNA degradation that arise fromdifferences in the general state of the cells under their measuredconditions. This might include an increased burden on the ribosome poolleading to lower translation (Gorochowski et al., ACS Synthetic Biology,2014, 3 (3): 129-139; Ceroni et al., Nature Methods, 2015, 12(5):415-418), increased burden on polymerases leading to lower transcription(Tan et al., Nature Chemical Biology, 2009, 5(11): 842-848), ordifferences in mRNA degradation of different mRNA sequences (Bouvet andBelasco, Nature, 1992, 360(6403): 488-491). Accordingly, these caninclude both intrinsic and extrinsic properties (Rudge et al., ACSSynthetic Biology, 2016, 5(1): 89-98) of gene circuits. By describingcell growth and gene expression rates with well-specific values, we canarbitrarily account for any interdependency between the two (Scott etal., Science, 2010, 330(6007): 1099-1102), despite not explicitlymodelling the feedback process.

In principle, gene expression capacity could be any arbitrarytime-dependent function, which could seek to describe the feedback fromhigh circuit activity and therefore strong metabolic burden on cellulargene expression. But in practice, it is possible to make assumptionsabout the form of these functions, in order to make parameter inferencetractable.

In particular, the assumption can be made that gene expression capacityswitches between two values, r_(exp) and r_(stat), representingexponential and stationary phase growth. This implements the hypothesisthat gene expression capacity will be generally higher duringexponential phase, and transitions to a lower rate as the cell cultureapproaches its carrying capacity, and metabolism slows down. The ODmeasurements and logistic model can be used to define when thetransition occurs, by defining a critical colony density K_(c). Puttingthis together, the function used can be defined as

${k_{c}^{(i)} = \frac{{r_{c}^{(i)}\lbrack x\rbrack}^{n_{c}} + {r_{s}K_{c}^{n_{c}}}}{\lbrack x\rbrack^{n_{c}} + K_{c}^{n_{c}}}},$

where n_(c) determines how fast K_(c) ^((i)) switches between r_(c)^((i)) and r^(s).

Finally, it is important to note that uniform values of r^(s), k_(c) andn^(c) are used to prevent overfitting.

The parameter inference problem can therefore be defined for this partof the method. For convenience, we define θ to be the vector ofparameter values sought. For the switch hypothesis, values of r_(c)^((i)), r^(s), K^(c) and n^(c) are sought to parameterise the switchfunction, in addition to the hypothesis-independent parameter d_(RFP).Given estimates of the logistic growth parameters (r^((i)), K^((i)), c₀^((i))) for each measured cell culture obtained in (a), the parametersof the gene expression capacity function can be inferred by comparingRFP fluorescence measurements with simulations of the simplified model(Grant et al., Molecular Systems Biology, 2016, 12(1): 849-849). Incontrast to the optimisation procedure in (a), as indicated above theMarkov chain Monte Carlo (MCMC) methods are preferred for theoptimisation procedure in (b). In particular, the Nelder-Mead algorithmwas found to perform poorly for this inference problem, frequentlygetting stuck in sub-optimal global optima of the cost function. MCMCmethods are able to reduce the impact of this problem. In practice, wefound consistent convergence to single optimal regions in parameterspace for each problem that we attempted.

The Metropolis-Hastings algorithm as implemented in the Filzbachsoftware may be used to perform the MCMC parameterizations. Thisrequires first specifying a likelihood function that evaluates thelikelihood score for a candidate parameter set θ. If we denote byR_(i,1), . . . , R_(i,n) the RPF measurements of well i at time pointst₁, . . . , t_(n), then the associated likelihood function can bedefined as:

${L(\theta)} = {\prod\limits_{i = 1}^{n_{W}}{\prod\limits_{j = 1}^{n}{\frac{1}{\sqrt{2\sigma}}\exp\left\{ {- \frac{\left( {{\lbrack{RFP}\rbrack^{(i)}\left( t_{j} \right)} - R_{i,j}} \right)^{2}}{2\sigma^{2}}} \right\}}}}$

where n_(W) is the number of wells.

As a variation to the above, gene expression capacity for a referencesignal such as red fluorescent protein (RRP) can be modelled as follows:

Constitutive production of RFP mRNA, followed by translation andfluorescent protein maturation can be described by the followingchemical reaction network:

${g_{RFP}\overset{k_{m}}{\rightarrow}{g_{RFP} + m_{RFP}}}{m_{RFP}\overset{k_{p}}{\rightarrow}{m_{RFP} + {iRFP}}}{{iRFP}\overset{k_{mat}}{\longrightarrow}{RFP}}{m_{RFP}\overset{d_{m} + \gamma}{\longrightarrow}\varnothing}{{iRFP}\overset{d_{RFP} + \gamma}{\longrightarrow}\varnothing}{{RFP}\overset{d_{RFP} + \gamma}{\longrightarrow}\varnothing}$

where k_(m), k_(p) and k_(mat) (μ_(R)) are transcription, translationand maturation of RFP respectively, γ is the rate of dilution, andd_(m), d_(RFP) are degradation rates of mRNA and (RFP) protein.

From the reaction set of the above paragraph, corresponding ordinarydifferential equations for the time evolution of the concentrations ofeach molecule, assuming mass action kinetics can be written as

${\frac{d\left\lbrack m_{RFP} \right\rbrack}{dt} = {k_{m} - {\left( {d_{m} + \gamma} \right)\left\lbrack m_{RFP} \right\rbrack}}}{\frac{d\lbrack{iRFP}\rbrack}{dt} = {{k_{p}\left\lbrack m_{RFP} \right\rbrack} - {\left( {d_{RFP} + \gamma + \mu_{R}} \right)\lbrack{iRFP}\rbrack}}}{\frac{d\lbrack{RFP}\rbrack}{dt} = {{\mu_{R}\lbrack{iRFP}\rbrack} - {\left( {d_{RFP} + \gamma} \right)\lbrack{RFP}\rbrack}}}$

Here, the effect of dilution is described by the specific growth rate γ,from the following equation already shown above:

${\gamma\left( {c,\rho} \right)} = \left\{ \begin{matrix}{0,} & {t < t_{lag}} \\{{r\left( {1 - \frac{c}{K}} \right)},} & {t \geq t_{lag}}\end{matrix} \right.$

This dilution term represents the decline in concentration as the volumeof cells increases, and since the cell growth for each well ischaracterised, the parameterized logistic model can be used to determinean accurate time-dependent dilution rate for each well in the microtiterplate.

The dynamics of transcription and mRNA turnover are usually faster thantranslation and protein turnover, and therefore a separation oftimescales argument can be applied to simplify the model equations (ofthe above paragraph) as [m_(RFP)]*≈k_(m)/d_(m)+γ. By further assumingthat the mRNA dynamics are faster than dilution, the above expressionbecomes constant, and can be incorporated into a single quantityincorporating the mRNA equilibrium and the translation rate k_(p). Ifthe maturation time is faster than the turnover time for the fluorescentproteins, then an even simpler model can be derived:

$\frac{{d\lbrack{RFP}\rbrack}_{i}}{dt} = {a_{RFP} - {\left( {d_{RFP} - {\gamma_{i}(c)}} \right)\lbrack{RFP}\rbrack}_{i}}$

where α_(RFP) incorporates transcription, translation, fluorescentprotein maturation and mRNA degradation rates.

The reaction model above assumes a particular level of abstraction fordescribing the inserted biochemical components. Specifically, it enablesthe description of each component of the central dogma (DNA, RNA,protein) explicitly, but ignores host cell machinery (RNA polymerase,ribosomes, proteases, etc). In particular, an open loop analogue isparameterised. In the previous section regarding cell growthcharacterisation, it was shown that well-specific parameters can be usedto describe well-specific growth curves. In essence, this is the firststep towards an open loop system. When the synthetic circuit createsmore burden on the cell, lower values of r_(i) and K_(i) will beobserved.

A quantity h_(i) (c; ξ_(i)) can be introduced which is a function ofsample-specific parameters ξ_(i) that quantifies the gene expressioncapacity of sample i such that all gene expression terms in the modelare proportional to h_(i). As the RFP measurements are used toparameterise h_(i), α_(RFP) is exchanged for h_(i) in the equationabove, as it would not be possible to separate the two values ifexpressed as a product. As such, the values inferred for h_(i) are inunits of (mature) RFP molecules per unit time.

A sample-specific (enables differences in transcription, translation andmRNA degradation to be captured that arise from differences in thegeneral state of the cells under their measured conditions. This mightinclude an increased burden on the ribosome pool leading to lowertranslation, increased burden on polymerases leading to lowertranscription, or differences in mRNA degradation of different mRNAsequences. Accordingly, these can include both intrinsic and extrinsicproperties of synthetic gene circuits. By describing cell growth andgene expression rates with well-specific values, any interdependencybetween the two can be arbitrarily accounted for, despite not explicitlymodelling the feedback process. In principle, gene expression capacitycould be any arbitrary time-dependent function, which could seek todescribe the feedback from high circuit activity and therefore strongmetabolic burden on cellular gene expression. Here, several contrastingyet relatively simple assumptions are considered about the functionalform of h_(i) (c).

To generalize the description of the control phase, we describe thecontrol circuit (from which the control/reference signal is produced) as

{dot over (x)} _(C) =f _(C)(x _(C) ,u _(i) ,h _(i);ϕ,ξ_(i))

y _(C) =g _(C)(x _(C))

where x_(c) is a vector of state variables that at least includes [RFP],μ_(i) are the treatments being applied to the circuit in well i, h_(i)is the model of gene expression capacity as defined above, Φ arewell-independent parameters and xi, are the parameters for h_(i) (c).

Characterisation of the DNA Component—Target Phase

In this part of the method parameters quantifying the intrinsicproperties of the DNA component are estimated, by embedding theseparameters within a further mathematical model that describes theproduction of the one or more target signals over time, and byminimizing a measure of the difference between the model outputs and themeasurements relating to the amount of the one or more target signals,wherein the further mathematical model additionally uses the parametervalues estimated in (a) (i.e. cell growth characterisation) or parametervalues based thereon. Where the further mathematical model usesparameter values estimated in (a) this is referred to as direct dynamiccharacterisation. Such a method does not seek to establish geneexpression capacity based on a reference signal. In particular, wherethe method is performed with measurements from a plurality of cellcultures characterisation does not attempt to quantify between-culturevariations in gene expression capacity.

Alternatively, the further mathematical model can be termed as the thirdmathematical model and use parameter values based on parameter valuesestimated for the first mathematical model. In one embodiment this meansthat the parameter values estimated for the first mathematical model areused in a second mathematical model which seeks to quantify overall geneexpression capacity of the cell colony, as described in the sectionabove. This further/third mathematical model uses the parameter valuesestimated using the second mathematical model by taking the secondmathematical model to provide an estimate of the capacity of the cellculture to produce one or more target signals over time.

In this part of the method parameters quantifying the intrinsicproperties of the DNA component are estimated, by embedding theseparameters within a further/third mathematical model that describes theproduction of the one or more target signals over time, and byminimizing a measure of the difference between the model outputs and themeasurements relating to the amount of the one or more target signals,wherein the third mathematical model additionally uses the parametervalues estimated in the part of the method involving the secondmathematical model by taking the second mathematical model to provide anestimate of the capacity of the cell culture to produce the one or moretarget signals over time.

As mentioned previously, the general idea of dynamic ratiometriccharacterisation is to use the measurements and model of the referencesignal to quantify variations in cell behaviour between separate cellcultures (e.g. in wells of a microtiter plate), and use these to factorout assumedly equivalent variations in the functioning of the geneticcircuit comprising the DNA component.

In the description below we describe the use of a genetic circuit inwhich the target signal is from yellow fluorescent protein and themeasurements relating to the amount of the target signal is measurementsof the level of YFP fluorescence. These embodiments are preferred,however, it will be appreciated that (c) can be performed with othertarget signals, and with other related measurements of the level of thetarget signal.

The simplest synthetic gene circuit involving constitutive expression ofYFP can be considered first. For consistency, the same chemical reactionnetwork model as for constitutive expression of the reference signal(equations 7 above) can be used, but with YFP-specific degradation andmaturation parameters.

$\begin{matrix}{\frac{{d\lbrack{YFP}\rbrack}^{(i)}}{dt} = {{\rho_{Y}k_{c}^{(i)}} - {\left( {d_{YFP} + \gamma^{(i)}} \right)\lbrack{YFP}\rbrack}^{(i)}}} & (10)\end{matrix}$

As transcription, translation and mRNA degradation will differ betweenYFP and RFP, and to incorporate the same notion of ratiometriccharacterisation from before (Yordanov et al., ACS Synthetic Biology,3(8):578-588, 2014; Rudge et al., ACS Synthetic Biology, 5(1):89-98,2016), we consider the gene expression capacity of YFP to beproportional to the gene expression capacity of RFP, with a factor ργ.

Alternatively, the method involving the further/third mathematical modelcan be performed using the compact general formulation

{dot over (x)} _(T) =f _(T)(x _(T) ,u _(i) h _(i);θ,ξ_(i))

y _(T) =g _(T)(x _(T))

Where x is a vector of state variables that at least includes variablesfor the intracellular concentration of each measured fluorescent proteinand θ contains the global circuit parameters. For ratiometric dynamiccharacterisation, h_(i) (c; ξ_(i)) is determined during the controlphase (utilising measurements relating to the amount of one or moretarget signals), with the maximum likelihood estimates of ξ_(i) used tosimulate f(x, u_(i), h_(i)).

For direct dynamic characterisation, h_(i) is simply set to be 1throughout.

Inferring Parameters of ODE Models Using Markov Chain Monte Carlo

According to the invention, the Markov chain Monte Carlo (MCMC) methodscan be used in either one or both of the parts of the method using thesecond or further/third mathematical models. In particular, in both ofthese parts of the method parameters are inferred for ordinarydifferential equation (ODE) models, given some observational data. It ispreferred that these parts of the method do not use the Nelder-Meadalgorithm.

MCMC methods also have the advantage of characterising the uncertaintyof parameter estimates, which can arise from several sources:measurement error, process error (molecular stochasticity) and modelmisspecification.

For notational convenience, θ is defined to be the vector of parametervalues sought. In particular, the Metropolis-Hastings algorithm asimplemented in the Filzbach software (http://www.github.com/predictionmachines/filzbach) can be used to perform the MCMC parameterizations.This requires specifying a function that evaluates the log-likelihoodscore for a candidate parameter set θ, and prior distributions of eachparameter, which encode prior belief of its plausible values.

Computing the log-likelihood. If we denote by y_(w, i, j), the bulkfluorescence wavelengths w (wϵ{480,530,610}) in well i (i=1, . . . ,n_(c)) at time-points t_(j) (j=1, . . . , n_(t)), then a likelihoodfunction for wavelength w can be defined as

${L_{w}(\theta)} = {{p\left( {y_{w}❘\theta} \right)} = {\prod\limits_{i}{\prod\limits_{j}{\frac{1}{\sqrt{2\sigma}}\exp\left\{ {- \frac{\left( {{B_{w}^{(i)}\left( t_{j} \right)} - y_{w,i,j}} \right)^{2}}{2\sigma^{2}}} \right\}}}}}$

where n_(c) is the number of colonies (e.g. wells in the 96-well plate).In the control phase, we simply use L₆₁₀ as the complete likelihoodfunction, while in the target phase we use L₄₈₀×L₅₃₀ when both CFP andYFP reporters are present in a circuit, and just one term otherwise. Theparameter σ describes the standard deviation of the data. Here, we infera during application of MCMC. As is commonly done in likelihood-basedanalyses, we work with the log of the likelihood score, as this isnumerically favourable. This is straightforward here, as the products inL_(w) become summations in log L_(w).

Specification of prior distributions. Prior distributions encode ourprior belief about the values of parameters. When characterising avariable that has not previously been used in a model before, it can bedifficult to know how to set the prior, so in this case we use a uniformdistribution with wide bounds. This prevents the MCMC sampler from beingswayed by an inappropriate value. When a parameter has been seen inanother analysis, we propagate the marginal posterior from the previousanalysis as a prior. To do this, we use truncated Gaussiandistributions, where the mean and standard deviation are calculated fromthe previous MCMC samples, and the bounds are taken as the uniform priorbounds of that same analysis.

Accordingly, in one embodiment, the method of the invention uses themarginal parameter posteriors of parameters that have been inferred inan upstream inference. In particular, the method may use a Gaussianapproximation of the marginal posterior of the parameter as a prior, butother parametric distribution approximations can also be used.

Computing the Posterior Predictive Distribution

To evaluate a model against unseen data, we approximate the posteriorpredictive distribution of the data, given our best estimates of thedistributions of the model's parameters. To do this, we formulate amodel of the synthetic gene circuit being measured. In this article, weshow calculations for the posterior predictive distribution of a circuitfor which the parameters have already been characterised, so all arespecified as truncated Gaussians, as described above. The predictionsare formed by applying the cell growth phase and control phase asappropriate for dynamic characterisation, but for the target phase weintegrate over the prior.

Accordingly, we approximate p(y′|y) by marginalising over the posteriorsof p(θ|y) to give approximate priors π_(θ) and then producing MonteCarlo samples θ_(k)˜π_(θ), as

$\begin{matrix}{{\log{p\left( {y^{\prime}❘y} \right)}} = {\log{\int{{p\left( {y^{\prime}❘\theta} \right)}{p\left( {\theta ❘y} \right)}d\theta}}}} \\{\approx {\frac{1}{N}{\sum\limits_{k = 1}^{N}{\log{p\left( {y^{\prime}❘\theta_{k}} \right)}}}}}\end{matrix}$

Further Aspects

As indicated above the method of the invention determines one or moreintrinsic properties of a DNA component. Knowledge of these properties,can then be used to adapt the DNA component and include it in a geneticcircuit to improve the performance of that circuit, e.g. by increasingthe number of copies of the DNA component in the genetic circuit so asto improve efficiency of transcription of a gene.

In particular, the present invention further provides a method ofoptimizing expression of at least one gene comprised in a geneticcircuit, wherein the genetic circuit further comprises a DNA componentwhich is involved in transcription of the at least one gene, wherein themethod comprises: (1) determining one or more intrinsic properties ofthe DNA component using the method of the invention described above; (2)using the one or more intrinsic properties of the DNA componentdetermined in (1) to simulate expression of the at least one gene fromthe genetic circuit in at least two different arrangements of thegenetic circuit; (3) selecting the arrangement in (2) that results inoptimal expression of the at least one gene; and (4) making thearrangement of the genetic circuit selected in step (3).

The invention further provides a computer program product embodied on acomputer readable storage and comprising code which is configured so asto perform the operations of the method for determining one or moreintrinsic properties of a DNA component (as described above) when run ona computer system.

The following are intended as examples only and do not limit the presentinvention.

EXAMPLES Results

To demonstrate how dynamic characterisation can be used to quantify theproperties of biological compo-nents, we investigated homoserine lactone(HSL) signalling components. These components have been studiedconsiderably in their natural contexts, including the Lux system from V.fischeri (Stevens et al., J. Bacteriol., 179(2): 557-562, 1997) and theLas system from P. aureginosa (Schuster et al., PNAS, 101(45):15833-15839, 2004). The Lux and Las systems have also been used insynthetic biology contexts, either alone (Danino et al., Nature,463(7279): 326-330, 2010), in co-culture (Balagadde et al., MolecularSystems Biology, 4(1):187, 2008), or integrated into the same hostorganism (Grant et al., Molecular Systems Biology, 12(1): 849, 2016).Here, we sought to apply dynamic characterisation to measurements ofsynthetic gene circuits that incorporate Lux and Las signallingcomponents—receivers, senders and degraders—to establish quantitativeestimates of their behaviours in E. coli.

Inference Graphs Enable Precise Specification of Parameter Dependencies.

An inference graph is defined as a Directed Acyclic Site Graph, whereeach node in the graph contains one or more sites and each site is asystem, referred to by the system name. Each node essentially denotes aninference problem, consisting of one or more systems with sharedparameters. Each edge between two nodes is labelled with a set ofparameters to propagate from the source node to the target node. Thesyntax of inference site graphs is defined as follows, where italicisednames denote syntax variables:

Property ::= | Fixed | Normal Parameter ::= | Name | Name = PropertyLocation ::= | Name | Name.Name Edge ::= Edge Location − >[Parameter_(1;)...; Parameter_(N)] Location Node ::= node Name systems =[Name₁;...; Name_(N)]; settings = Inference_settings;

We use a dot notation to refer to a specific system at a given node inthe graph. For the corresponding implementation, the system-specificparameters can be given global names by using the system name as aprefix to avoid parameter clashes. If only a single system is present ata node, we allow syntactic sugar so that the system name can be used asthe node name, and the system inference settings can be used as the nodeinference settings. Note that we also allow edges between nodesdirectly, to indicate that the parameters are shared between all systemspresent at the node. A node is executed by running inference on the nodeusing the distributions of the parameters from the in-bound nodes aspriors. The execution semantics essentially associates parameterposteriors to each node at the end of an inference run, where someparameters are local to a specific system at the node, while otherparameters are shared between all systems at the node

Modular Construction and Modelling of Synthetic Gene Circuits.

In order to characterise HSL signalling components, we designed acollection of synthetic gene circuits of increasing complexity, reusingcomponents in equivalent genetic contexts as much as possible (FIG. 5Ato F). For each circuit, a model was proposed that explicitly describesthe inserted components, with parameters reused when the same promoter,ribosome binding site and protein coding regions were used. The modelsare all derived from elementary chemical reactions but make severalassumptions about the timescales of different biological processes,culminating in only proteins and small molecules being explicitlydescribed by ordinary differential equations (as described elsewhereherein). In FIG. 5A to F, a code snippet shows the module calls within achemical reaction network (CRN) model, illustrating via a simplegraphical schema how the modular design of each circuit maps directlyonto a programmatic description. Also shown in FIG. 5A to F are examplecomparisons of calibrated model simulations against measurements, foreach circuit.

We started by characterising cellular autofluorescence, so that suchquantification could be used to account for autofluorescence inmeasurements of circuits expressing YFP and CFP explicitly (FIG. 5A). Toapportion cell number and autofluorescent material appropriately, wepharmacologically perturbed the cells with EtOH, which above a thresholdconcentration led to very slow cell growth. We found that the per-cellrate of autofluorescence production at 480 nm (corresponding to CFPmeasurements) was an order of magnitude higher than at 530 nm(corresponding to YFP measurements). The autofluorescence parameterswere then reused in subsequent circuits.

As CFP and YFP are used in the majority of the synthetic gene circuitsconsidered, we attempted to characterise their stability in a circuitfree of the complexity of inducible expression (FIG. 5B). Therefore, weexpressed each fluorescent protein constitutively using the PR promoter,and then measured this PRPR circuit for 36 hours. A model of the PRPRcircuit incorporated autofluorescence terms, which shared informationwith the Auto circuit in FIG. 5A. The observed curvature in thetime-series measurements of PRPR enabled identification of the CFP andYFP degradation parameters.

The first HSL components we characterised were the receiver proteins andHSL-inducible promoters, as these receiver components must be used incircuits that seek to characterise HSL senders and/or degraders. We usedthe double receiver (DR) circuits developed by the authors and theircollaborators in (Grant et al., Molecular Systems Biology, 12(1): 849,2016), which acts as a dual reporter of C6-HSL and C12-HSL (FIG. 5C).This is achieved by the constitutive expression of LuxR and LasR, andthe inducible expression of fluorescent reporter proteins by geneticallyaltered PLux promoters that respond almost orthogonally to regulatorscomposed of C6-HSL bound to LuxR or C12-HSL bound to LasR. In this way,CFP reports the concentration of C6-HSL and YFP reports theconcentration of C12-HSL, both across a broad range of concentrations.Because different promoters and ribosome binding sites are upstream ofthe coding sequences for CFP and YFP, as compared to the PRPR circuit,the parameters for their maximal production rate were assumed to take onnew parameters. However, the degradation parameters were assumed to bethe same as for PRPR, as the degradation rate is an intrinsic propertyof the protein. We used four different DR circuits, each incorporating aunique combination of ribosome binding sites in their LuxR and LasRexpression cassettes. As explained in previous work, this enables theeffect of variable intracellular concentrations of LuxR and LasR to bemodelled and therefore characterised quantitatively, which is criticalfor identifying chemical and genetic crosstalk (Grant et al., MolecularSystems Biology, 12(1): 849, 2016). We found that the accumulation offluorescent protein in response to different concentrations of C6-HSLand C12-HSL could be captured with parameterizations that were similarto those found previously (Grant et al., Molecular Systems Biology,12(1): 849, 2016), demonstrating that dynamic characterisation appliedto circuits with only moderate changes in promoter activity over timeproduces equivalent results to static methods.

We next characterised LuxI and LasI, the synthases of C6-HSL and C12-HSLrespectively. HSL sender relay circuits were designed that induciblyexpress the synthase of one signal in response to the other, producing atruly time-varying promoter activity for the response to the synthesizedsignal that therefore cannot be characterised using static methods. Thefirst relay uses the C6-HSL-responsive pLux76 promoter to driveexpression of LasI (abbreviated as P76-LasI from now on), leading tointracellular production of C12-HSL and subsequent induced expression ofYFP via the pLas81 promoter (FIGS. 5D & 12B). The second uses theC12-HSL responsive pLas81 promoter to drive expression of LuxI(P81-LuxI), leading to intracellular production of C6-HSL and subsequentinduced expression of CFP (FIG. 12A). Again, reusing componentcharacterisation of the previously described circuits, we modelledtime-series measurements of the response to different concentrations ofC6-HSL and C12-HSL, enabling quantification of the ability of Luxl andLasI to synthesize HSLs.

To characterise the HSL degrader AiiA (Liu et al., Biochemistry,47(29):7706-7714, 2008), we designed a circuit in which AiiA was placedunder the control of the arabinose-inducible PBad promoter (FIG. 5F), sothat different intracellular levels of AiiA could be producedexperimentally. However, analogous to the strategy used thus far, wefirst characterised a simpler circuit in which the properties ofarabinose induction of PBad could be quantified, by simply using PBad todrive YFP expression (FIGS. 5E & 13 ). Incorporating the YFPautofluorescence parameter from above, our model of the Pbad circuitclosely described the experimental measurements in response to differentconcentrations of arabinose (FIG. 5E) and enabled identification of thetransfer function parameters for arabinose-induction. Embedding thecharacterised PBad and DR modules into the model of the AiiA circuitenabled modelling of the response to different concentrations of C6-HSL,C12-HSL and arabinose. Of particular importance in reproducing theobserved bulk fluorescence dynamics was the incorporation of the effecton cell growth of high arabinose induction of AiiA, which was observablein the OD₆₀₀ data, and therefore quantified during the cell growth phaseof dynamic characterisation.

Time-Varying Gene Expression Capacity Improves Model Fit in StationaryPhase.

The simplest assumption that can be made about cellular gene expressioncapacity (h(c)) is that it remains constant through time. However,because circuit activity changes over time, this assumption is likely tobreak down, as RNA polymerases and ribosomes may become limiting whencircuit activity increases. As such, we considered four alternativetime-dependent functions for defining h(c) and applied dynamiccharacterisation to each circuit (Table 1).

TABLE 1 Functional forms used to define gene expression capacity(h_(i)(c)). Description Function (h_(i)(c)) Constant r_(c) ^((i))TargetGrowth r_(c) ^((i))((1 − ϵ_(c))γ(c) + ϵ_(c)) Tracks thegrowth-rate γ, falling to a basal level ϵ_(c). TargetGrowthDelay r_(c)^((i))((1 − ϵ_(c))γ(c(t − τ)) + ϵ_(c)) Tracks the growth-rate γ at timet − τ TargetRSwitch Sigmoid that switches at c = K_(c) from awell-specific value r_(c) to a relative value ϵ_(c)$r_{c}^{(i)}\frac{c^{n_{c}} + {\epsilon_{c}K_{c}^{n_{c}}}}{c^{n_{c}} + K_{c}^{n_{c}}}$TargetSwitch Sigmoid that switches at c = K_(c) from a well-specificvalue r_(c) to a non-well-specific basal value r_(s)$\frac{{r_{c}^{(i)}c^{n_{c}}} + {r_{s}K_{c}^{n_{c}}}}{c^{n_{c}} + K_{c}^{n_{c}}}$

In each case, the well-specific component rc^((i)) is inferred duringthe control phase, under a Constant hypothesis. Then, thenon-well-specific parameters (e.g. ϵ_(c) in TargetGrowth) is inferred inthe target phase alongside the target circuit parameters θ. As such,these hypotheses assume that the gene expression capacity for thechromosomally integrated control undergoes contrasting regulation fromthat of the plasmid-expressed circuit components. In principle, everygene expressed on the plasmid might be regulated differently, whichwould require gene-specific functions and/or parameterisations. Here, weonly consider one such function and parameterisation that is applied tothe whole circuit, preventing a combinatorial increase in the number ofparameters to infer.

Dynamic Characterisation can be Applied Sequentially or Simultaneouslyto Multiple Datasets.

We describe herein how dynamic characterisation can be applied in asequential manner to measurements of circuits of increasing complexity.Parameters of one circuit can be reused in models of circuits that embedcomponents of the upstream circuit (FIG. 4B-D), by propagating theposterior marginal distributions of the upstream circuit parameters aspriors of the downstream circuit parameters. For the six circuits shownin FIG. 5A-F, the dependency graph is non-trivial (FIG. 8 ), butacyclic, as is necessary for this approach. The alternative strategy toa graph-based approach would be to simultaneously parameterize themodels of each circuit in a single application of dynamiccharacterisation. With the graph method one should only marginalize overMCMC chains that have reached the same local maximum of the likelihoodfunction, and discard those stuck in other local maxima. With the

A comparison of the two methods can be seen in FIGS. 6G and H.

Methods Method for Obtaining Measurements of Synthetic Gene Circuits ina Microplate Fluorometer.

Plate fluorometer assays were conducted as previously described by Grantet al., (Mol Syst Biol. (2016) 12(1): 849, 2016). Briefly, overnightcultures of cells containing constitutive chromosomal mRFP1 and theplasmid construct of interest were diluted 1:100, grown to an OD ofapproximately 0.5, then diluted 1:1000 into M9 supplemented with 0.2%casamino acids and 0.4% glucose. 200 μl of culture was aliquoted intoeach well and measurements were taken every 10 min for 1,000-2,000 minin a BMG FLUOstar Omega plate fluorometer. 3-oxohexanoyl-homoserinelactone (3OC6HSL, Cayman Chemicals) and 3-oxododecanoyl-homoserinelactone (3OC12HSL, Cayman Chemicals) were dissolved to a concentrationof 200 mM in DMSO then 3OC6HSL was diluted in supplemented M9 medium tothe concentrations described, while 3OC12HSL, due to its limitedsolubility in aqueous media, was first diluted 1:50 in ethanol thendiluted in supplemented M9 medium to the concentrations described. A 1Marabinose (Sigma) stock solution was made in water, filter sterilized,and diluted in supplemented M9 medium to the concentrations described.HSL receiver and sender plasmids were previously described (Mol SystBiol. (2016) 12(1): 849, 2016), and all other plasmids were constructedusing Gibson Assembly (Gibson et al., Nature Methods, 6(5): 343-345, May2009) from parts obtained from the MIT Registry of Standard BiologicalParts (http://partsregistry.org)

Example 1—No Plasmid (Auto)

The simplest possible cell line to characterise is one in which there isno synthetic gene circuit at all. How-ever, applying dynamiccharacterisation to no circuit is useful to characterise howautofluorescence de-pends on cell density and gene expression capacity.Therefore, our chromosomal RFP-expressing cells were measured under arange of conditions, to explore how gene expression capacity influencedtime-series measurements at fluorescence wavelengths corresponding toeYFP and eCFP. Design of chromosomal RFP circuit is shown in FIG. 9 .

Model Definition

The model we used for autofluorescence assumes that the rate ofautofluorescence is proportional to gene expression capacity, h(c), andthat the fluorescent material dilutes with cell growth. Bulkautofluorescence is then quantified in exactly the same way as it iswhen there is a synthetic gene circuit inside the cells. As such, theequations for intracellular autofluorescence corresponding to eYFP andeCFP are

$\begin{matrix}{\frac{dc}{dt} = {{\gamma(c)}.c}} & \left( {1a} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left\lbrack F_{530} \right\rbrack}{dt} = {{a_{530}.{h(c)}} - {{\gamma(c)}\left\lbrack F_{530} \right\rbrack}}} & \left( {1b} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left\lbrack F_{480} \right\rbrack}{dt} = {{a_{480}.{h(c)}} - {{\gamma(c)}\left\lbrack F_{480} \right\rbrack}}} & \left( {1c} \right)\end{matrix}$

To compare with experimental measurements, we consider the bulkfluorescence given by

B ₄₈₀ =c·[F ₄₈₀ ]+B ₄₈₀ ^(back)  (2a)

B ₅₃₀ =c·[F ₅₃₀ ]+B ₅₃₀ ^(back)  (2b)

where [F₄₈₀] and [F₅₃₀] are modelled as in back

B ₆₁₀=([RFP])×c+B ₆₁₀ ^(back)

B ₄₈₀=([CFP]+[F ₄₈₀])×c+B ₄₈₀ ^(back)

B ₅₃₀=([YFP]+[F ₅₃₀])×c+B ₅₃₀ ^(back)

[YFP] and [CFP] are set to zero, as they are not expressed in thiscircuit.

Characterisation Experiment

We infer the Auto model parameters by using uniform priors on α₄₈₀,α₅₃₀, B₄₈₀ ^(back) and B₅₃₀ ^(back). The parameters within y are takento be the maximum likelihood estimates from the cell densitycharacterisation phase, and the (are either taken to be the maximumlikelihood estimates from the control characterisation phase or areinferred within the target phase, depending on the hypothesis inquestion.

Example 2—Constitutive Expression (PRPR)

The simplest circuit that we consider in this article uses constitutivepromoters to drive expression of eYFP and eCFP as shown in FIG. 10 .Using such a simple circuit enables us to characterise properties of thefluorescent proteins, which can then be reused in to aidecharacterisation of circuits using the same fluorescent proteins.

Model Definition

As this is the first (non-trivial) synthetic gene circuit that wedescribe, we include a detailed derivation. We start by considering thechemical reactions introduced by the circuit in FIG. 10 .

If we denote by g the plasmid containing the PL:eYFP and PL:eCFPcassettes, then we can write a system of chemical reactions thatdescribe transcription, translation and fluorescent protein maturationas

$\begin{matrix}{{g\overset{a_{r0011}}{\longrightarrow}g} + {{m\_ b0034}{\_ eYFP}}} & \left( {3a} \right)\end{matrix}$ $\begin{matrix}{{m\_ b0034}{{\_ eYFP}\overset{d_{m\_ b0034\_{eYFP}}}{\longrightarrow}\varnothing}} & \left( {3b} \right)\end{matrix}$ $\begin{matrix}{{{m\_ b0034}{{\_ eYFP}\overset{k_{b34.{h(c)}}}{\longrightarrow}{m\_ b0034}}{\_ eYFP}} + {eYFPi}} & \left( {3c} \right)\end{matrix}$ $\begin{matrix}{{eYFPi}\overset{d_{YFP}}{\rightarrow}\varnothing} & \left( {3d} \right)\end{matrix}$ $\begin{matrix}{{eYFPi}\overset{\mu_{YFP}}{\rightarrow}{eYFP}} & \left( {3e} \right)\end{matrix}$ $\begin{matrix}{{eYFP}\overset{d_{YFP}}{\rightarrow}\varnothing} & \left( {3f} \right)\end{matrix}$ $\begin{matrix}{{g\overset{a_{r0011}}{\longrightarrow}g} + {{m\_ b}0034{\_ eCFP}}} & \left( {3g} \right)\end{matrix}$ $\begin{matrix}{{m\_ b0034}{{\_ eCFP}\overset{d_{m\_ b0034\_{eCFP}}}{\longrightarrow}\varnothing}} & \left( {3h} \right)\end{matrix}$ $\begin{matrix}{{{m\_ b0034}{{\_ eCFP}\overset{k_{b34.{h(c)}}}{\longrightarrow}{m\_ b}}0034{\_ eCFP}} + {eCFPi}} & \left( {3i} \right)\end{matrix}$ $\begin{matrix}{{eCFPi}\overset{d_{CFP}}{\rightarrow}\varnothing} & \left( {3j} \right)\end{matrix}$ $\begin{matrix}{{eCFPi}\overset{\mu_{CFP}}{\rightarrow}{eCFP}} & \left( {3k} \right)\end{matrix}$ $\begin{matrix}{{eCFP}\overset{d_{CFP}}{\rightarrow}\varnothing} & \left( {3l} \right)\end{matrix}$

where h(c) is the gene expression capacity, as introduced above in thedetailed description.

Translating to ODEs, and replacing [g] with N, the plasmid copy number,we obtain

$\begin{matrix}{\frac{dc}{dt} = {{\gamma(c)}.c}} & \left( {4a} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left\lbrack {{m\_ b0034}{\_ eYFP}} \right\rbrack}{dt} = {{a_{r0011}.N} - {\left( {d_{m\_ b0034\_{eYFP}} + {\gamma(c)}} \right)\left\lbrack {{m\_ b0034}{\_ eYFP}} \right\rbrack}}} & \left( {4b} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{eYFPi}\rbrack}{dt} = {{k_{b0034}.{h(c)}.\left\lbrack {{m\_ b0034}{\_ eYFP}} \right\rbrack} - {\left( {d_{YFP} + \mu_{YFP} + {\gamma(c)}} \right)\lbrack{eYFP}\rbrack}}} & \left( {4c} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{eYFP}\rbrack}{dt} = {{\mu_{YFP}\lbrack{eYFPi}\rbrack} - {\left( {d_{YFP} + {\gamma(c)}} \right)\lbrack{eYFP}\rbrack}}} & \left( {4d} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left\lbrack {{m\_ b0034}{\_ eCFP}} \right\rbrack}{dt} = {{a_{r0011}.N} - {d_{m\_ b0034\_{eCFP}}\left\lbrack {{m\_ b0034}{\_ eCFP}} \right\rbrack}}} & \left( {4e} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{eCFPi}\rbrack}{dt} = {{k_{b0034}.{h(c)}.\left\lbrack {{m\_ b0034}{\_ eCFP}} \right\rbrack} - {\left( {d_{CFP} + \mu_{CFP} + {\gamma(c)}} \right)\lbrack{eCFPi}\rbrack}}} & \left( {4f} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{eCFP}\rbrack}{dt} = {{\mu_{YFP}\lbrack{eCFPi}\rbrack} - {\left( {d_{CFP} + {\gamma(c)}} \right)\lbrack{eCFP}\rbrack}}} & \left( {4g} \right)\end{matrix}$

where γ(c) is the cellular dilution of each molecular concentration.

Assumption 1: By assuming that mRNA dynamics are fast, we can remove themRNA species from the model entirely. That is, we equate (4b) and (4e)to zero, solve for [m_b0034_eYFP] and [m_b0034_eCFP], and thensubstitute into the remaining equations. This results in a reducedmodel.

$\begin{matrix}{\frac{dc}{dt} = {{\gamma(c)}.c}} & \left( {5a} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{eYFPi}\rbrack}{dt} = {{a_{YFP}.{h(c)}} - {\left( {d_{YFP} + \mu_{YFP} + {\gamma(c)}} \right)\lbrack{eYFP}\rbrack}}} & \left( {5b} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{eYFP}\rbrack}{dt} = {{\mu_{YFP}\lbrack{eYFPi}\rbrack} - {\left( {d_{YFP} + {\gamma(c)}} \right)\lbrack{eYFP}\rbrack}}} & \left( {5c} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{eCFPi}\rbrack}{dt} = {{a_{CFP}.{h(c)}} - {\left( {d_{CFP} + \mu_{CFP} + {\gamma(c)}} \right)\lbrack{eYFPi}\rbrack}}} & \left( {5d} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{eCFP}\rbrack}{dt} = {{\mu_{CFP}\lbrack{eCFPi}\rbrack} - {\left( {d_{CFP} + {\gamma(c)}} \right)\lbrack{eCFP}\rbrack}}} & \left( {5e} \right)\end{matrix}$ where $\begin{matrix}{{a_{YFP} = {\frac{a_{r0011}.N.k_{b0034}}{d_{m\_ b0034\_{eYFP}} + {\gamma(c)}} \approx \frac{a_{r0011}.N.k_{b0034}}{d_{m\_ b0034\_{eYFP}}}}},{a_{CFP} = {\frac{a_{r0011}.N.k_{b0034}}{d_{m\_ b0034\_{eCFP}} + {\gamma(c)}} \approx \frac{a_{r0011}.N.k_{b0034}}{d_{m_{b0034_{eCFP}}}}}}} & (6)\end{matrix}$

Here we remove γ(c) from the denominator because dilution will normallybe slower than mRNA degradation, and making this simplification resultsin α_(YFP) and α_(CFP) being constant. Finally, we note here thatdespite simplifying the parameterisation (removing α_(r0011), k_(b0034),d_(m_b0034_eYFP), and d_(m_b0034_eCFP)), we should keep in mind thatthere is a dependency of aYFP and aCFP on the biological parts in eachcassette.

Assumption 2: By assuming that fluorescent protein maturation is fast,we can remove the equations for the concentration of immaturefluorescent proteins. Let [YFP] be the sum of the concentrations ofimmature and mature fluorescent proteins. Adding (5b) to (5c), we obtain

$\begin{matrix}\begin{matrix}{{\frac{d}{dt}\left( {\lbrack{eYFPi}\rbrack + \lbrack{eYFP}\rbrack} \right)} = {{a_{YFP}.{h(c)}} - {\left( {d_{YFP} + {\gamma(c)}} \right)\left( {\lbrack{eYFPi}\rbrack + \lbrack{eYPF}\rbrack} \right)}}} \\{\left. \Rightarrow\frac{d\lbrack{YFP}\rbrack}{dt} \right. = {{a_{YFP}.{h(c)}} - {\left( {d_{YFP} + {y(c)}} \right)\lbrack{YFP}\rbrack}}}\end{matrix} & (7)\end{matrix}$

and similarly for CFP. If we assume that the immature form is instantlyconverted to the mature form, then the concentration of the mature formis equal to the total concentration. Therefore, the resultant system ofequations is given by

$\begin{matrix}{\frac{dc}{dy} = {{\gamma(c)}.c}} & \left( {8a} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{YFP}\rbrack}{dt} = {{a_{YFP}.{h(c)}} - {\left( {d_{YFP} + {y(c)}} \right)\lbrack{YFP}\rbrack}}} & \left( {8b} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{CFP}\rbrack}{dt} = {{a_{CFP}.{h(c)}} - {\left( {d_{CFP} + {y(c)}} \right)\lbrack{CFP}\rbrack}}} & \left( {8c} \right)\end{matrix}$

Parameter Prior for Target Characterisation Phase

We infer the PRPR model parameters by using (uninformative) uniformpriors on the previously uncharacterised parameters α_(CFP), α_(YFP),d_(CFP) and d_(YFP), and (informative) truncated Gaussian priors on α₄₈₀and α₅₃₀, with mean and standard deviation taken from the marginalposteriors of the Auto circuit characterisation. We also useuninformative priors on the B_(k) ^(back) parameters, withoutpropagating their marginal posteriors from previous circuits.

Characterisation Experiment

We measured the PRPR circuit in response to varying concentrations ofchloramphenacol, an inhibitor of protein translation. By inhibitingprotein translation, we directly alter the gene expression capacity termin our models, enabling us to test whether h(c) can capture such aneffect implicitly.

Example 3 Dynamic Characterisation of the Acyl Homoserine Lactone (AHL)Double Receiver (DR)

In this example we consider the dynamic characterisation of the AHLdouble receiver device introduced by Grant et al., (Mol Syst Biol.(2016) 12: 849). In this device, two variations of the wild-type P_(Lux)promoter, P_(OLux) and P_(OLas), were engineered to bind preferentiallyto activated luxR and lasR complexes respectively. As LuxR favoursbinding of C6 homoserine lactone (3OC6-HSL) and LasR favours binding ofC12 homoserine lactone (3OC12-HSL), optimised expression of LuxR andLasR can lead to near-orthogonal intracellular detection of C6-HSL andC12-HSL. The double receiver device was originally measured withP_(OLux) upstream of the coding sequence for cyan fluorescent protein(CFP), and P_(OLas) upstream of the coding sequence for yellowfluorescent protein (YFP). A plasmid containing all of this machinerywas then inserted into cells chromosomally expressing monomeric redfluorescent protein (mRFP1), which could be used as a ratiometriccontrol. The model we consider here builds on previouslydefined/parameterized models, but incorporates two importantdifferences:

-   -   1. The concentrations of luxR and lasR become dynamic        quantities, and are affected by dilution; and    -   2. luxR and lasR-based regulators (bound to HSLs) might bind to        more than one promoter

Example 3.1

We start by considering each HSL denoted by subscript kϵ{6, 12}, luxRand lasR, and the presence of two different populations of PLuxpromoters. Therefore, we start with the following equations:

$\begin{matrix}{\frac{d\lbrack R\rbrack}{dt} = {a_{R} + {u_{Rk}\left\lbrack R_{k} \right\rbrack} - {\lbrack R\rbrack\left( {\gamma + {b_{Rk}\left\lbrack C_{k} \right\rbrack}} \right)}}} \\{\frac{d\lbrack S\rbrack}{dt} = {a_{S} + {u_{Sk}\left\lbrack S_{k} \right\rbrack} - {\lbrack S\rbrack\left( {\gamma + {b_{Sk}\left\lbrack C_{k} \right\rbrack}} \right)}}} \\{\frac{d\left\lbrack R_{k} \right\rbrack}{dt} = {{{b_{Rk}\lbrack R\rbrack}\left\lbrack C_{k} \right\rbrack} + {2{u_{Dk}\left\lbrack D_{k} \right\rbrack}} - {\left\lbrack R_{k} \right\rbrack\left( {\gamma + u_{Rk} + {2{b_{Dk}\left\lbrack R_{k} \right\rbrack}}} \right)}}} \\{\frac{d\left\lbrack S_{k} \right\rbrack}{dt} = {{{b_{Sk}\lbrack S\rbrack}\left\lbrack C_{k} \right\rbrack} + {2{u_{Ek}\left\lbrack E_{k} \right\rbrack}} - {\left\lbrack S_{k} \right\rbrack\left( {\gamma + u_{Sk} + {2{b_{Ek}\left\lbrack S_{k} \right\rbrack}}} \right)}}} \\{\frac{d\left\lbrack D_{k} \right\rbrack}{dt} = {{b_{Dk}\left\lbrack R_{k} \right\rbrack}^{2} + {u_{GDk}{\sum\limits_{i}\left\lbrack {G_{i}.D_{k}} \right\rbrack}} - {\left\lbrack D_{k} \right\rbrack\left( {u_{Dk} + {b_{GDk}{\sum\limits_{i}\left\lbrack G_{i} \right\rbrack}}} \right)}}} \\{\frac{d\left\lbrack E_{k} \right\rbrack}{dt} = {{b_{Ek}\left\lbrack S_{k} \right\rbrack}^{2} + {u_{GEk}{\sum\limits_{i}\left\lbrack {G_{i}.E_{k}} \right\rbrack}} - {\left\lbrack E_{k} \right\rbrack\left( {u_{Ek} + {b_{GEk}{\sum\limits_{i}\left\lbrack G_{i} \right\rbrack}}} \right)}}} \\{\frac{d\left\lbrack {G_{i}.D_{k}} \right\rbrack}{dt} = {{{b_{GDk}\left\lbrack G_{i} \right\rbrack}\left\lbrack D_{k} \right\rbrack} - {u_{GDk}\left\lbrack {G_{i}.D_{k}} \right\rbrack}}} \\{\frac{d\left\lbrack {G_{i}.E_{k}} \right\rbrack}{dt} = {{{b_{GEk}\left\lbrack G_{i} \right\rbrack}\left\lbrack E_{k} \right\rbrack} - {u_{GEk}\left\lbrack {G_{i}.E_{k}} \right\rbrack}}} \\{\frac{d\left\lbrack G_{i} \right\rbrack}{dt} = {{u_{GDk}\left\lbrack {G_{i}.D_{k}} \right\rbrack} + {u_{GEk}\left\lbrack {G_{i}.E_{k}} \right\rbrack} - {\left\lbrack G_{i} \right\rbrack\left( {{b_{GDk}\left\lbrack D_{k} \right\rbrack} + {b_{GEk}\left\lbrack E_{k} \right\rbrack}} \right)}}}\end{matrix}$

Here, R_(k) refers to a luxR-HSL heterodimer and D_(k) is the tetramericcomplex comprising two R_(k) complexes (no cross-binding of C6 and C12dimers). Similarly, S_(k) and E_(k) are equivalent lasR-HSL complexes.

Solving most of the above system equal to zero, we obtain the quasi-

${equilibrium}{{\left\lbrack {G_{i}.D_{k}} \right\rbrack^{*} = {{K_{GDk}\left\lbrack G_{i} \right\rbrack}\left\lbrack D_{k} \right\rbrack}},{\left\lbrack D_{k} \right\rbrack^{*} = {K_{Dk}\left\lbrack R_{k} \right\rbrack}^{2}},{\left\lbrack R_{k} \right\rbrack^{*} = {{K_{Rk}\lbrack R\rbrack}\left\lbrack C_{k} \right\rbrack}}}{Where}{{K_{Rk} = \frac{b_{Rk}}{\gamma + \mu_{Rk}}},{K_{Dk} = {{\frac{b_{Dk}}{\mu_{Dk}}{and}K_{GDk}} = {\frac{b_{GDk}}{\mu_{GDk}}.}}}}$

Therefore (also symmetry of R and S),

$\begin{matrix}{\left\lbrack {G_{i}.D_{k}} \right\rbrack^{*} = {K_{GDk}{K_{Dk}\left( \frac{{K_{Rk}\left\lbrack C_{k} \right\rbrack}r}{1 + {K_{Rk}\left\lbrack C_{k} \right\rbrack}} \right)}^{2}}} \\{\left\lbrack {G_{i}.E_{k}} \right\rbrack^{*} = {K_{GEk}{K_{Ek}\left( \frac{{K_{Sk}\left\lbrack C_{k} \right\rbrack}s}{1 + {K_{Sk}\left\lbrack C_{k} \right\rbrack}} \right)}^{2}}}\end{matrix}$

where the new K's are defined as above, and s=n_(S)/γ.

By taking advantage of the conservation law[G_(i)]+[G_(i)·D₆]+[G_(i)·D₁₂]+[G_(i)·E₆]+[G_(i)·E₁₂]=N, we can derivethe rate of production of mRNA as a function of [R], [S][C₆] and [C₂] as

$\frac{a_{0} + {a_{Rk}{K_{GRk}\left( \frac{K_{Rk}C_{k}r}{1 + {K_{Rk}C_{k}}} \right)}^{2}} + {a_{Sk}{K_{GSk}\left( \frac{K_{Sk}C_{k}s}{1 + {K_{Sk}C_{k}}} \right)}^{2}}}{1 + {K_{GRk}\left( \frac{K_{Rk}C_{k}r}{1 + {K_{Rk}C_{k}}} \right)}^{2} + {K_{GSk}\left( \frac{K_{Sk}C_{k}s}{1 + {K_{Sk}C_{k}}} \right)}^{2}}{{{where}K_{GRk}} = {{K_{GDk}K_{Dk}{and}K_{GSk}} = {K_{GEk}{K_{Ek}.}}}}$

As before, we replace the exponent 2 of the HSL concentration with aparameter to-be-inferred, giving

$\frac{a_{0} + {a_{Rk}K_{GRk}{r^{2}\left( \frac{K_{Rk}C_{k}}{1 + {K_{Rk}C_{k}}} \right)}^{n}} + {a_{Sk}K_{GSk}{s^{2}\left( \frac{K_{Sk}C_{k}}{1 + {K_{Sk}C_{k}}} \right)}^{n}}}{1 + {K_{GRk}{r^{2}\left( \frac{K_{Rk}C_{k}}{1 + {K_{Rk}C_{k}}} \right)}^{n}} + {K_{GSk}{s^{2}\left( \frac{K_{Sk}C_{k}}{1 + {K_{Sk}C_{k}}} \right)}^{n}}}$

Accordingly, we obtain the following system of equations

$\begin{matrix}{\frac{dx}{dt} = {{rx}\left( {1 - \frac{x}{K}} \right)}} \\{\frac{d\lbrack{luxR}\rbrack}{dt} = {{a_{R}.{\xi(x)}} - {\left( {d_{R} + \gamma} \right)\lbrack{luxR}\rbrack}}} \\{\frac{d\lbrack{lasR}\rbrack}{dt} = {{a_{S}.{\xi(x)}} - {\left( {d_{R} + \gamma} \right)\lbrack{lasR}\rbrack}}} \\{\frac{d\lbrack{CFP}\rbrack}{dt} = {{a_{C}.{\xi(x)}.{f_{76}\left( {\left\lbrack C_{6} \right\rbrack,\left\lbrack C_{12} \right\rbrack,\lbrack{luxR}\rbrack,\lbrack{lasR}\rbrack} \right)}} - {\left( {d_{CFP} + \gamma} \right)\lbrack{CFP}\rbrack}}} \\{\frac{d\lbrack{YFP}\rbrack}{dt} = {{a_{Y}.{\xi(x)}.{f_{81}\left( {\left\lbrack C_{6} \right\rbrack,\left\lbrack C_{12} \right\rbrack,\lbrack{luxR}\rbrack,\lbrack{lasR}\rbrack} \right)}} - {\left( {d_{YFP} + \gamma} \right)\lbrack{YFP}\rbrack}}}\end{matrix}$ where $f_{p({r,s,{c_{6}c_{12}}})} = \frac{\begin{matrix}{a_{0}^{(p)} + {a_{1}^{R}K_{GR}^{(p)}r^{2}\frac{{K_{R6}^{n}c_{6}^{n}} + {K_{R12}^{n}c_{12}^{n}}}{\left( {1 + {K_{R6}c_{6}} + {K_{R12}c_{12}}} \right)^{n}}} +} \\{a_{1}^{S}K_{GS}^{(p)}s^{2}\frac{{K_{S6}^{n}c_{6}^{n}} + {K_{S12}^{n}c_{12}^{n}}}{\left( {1 + {K_{S6}c_{6}} + {K_{S12}c_{12}}} \right)^{n}}}\end{matrix}}{\begin{matrix}{1 + {K_{GR}^{(p)}r^{2}\frac{{K_{R6}^{n}c_{6}^{n}} + {K_{R12}^{n}c_{12}^{n}}}{\left( {1 + {K_{R6}c_{6}} + {K_{R12}c_{12}}} \right)^{n}}} +} \\{K_{GS}^{(p)}s^{2}\frac{{K_{S6}^{n}c_{6}^{n}} + {K_{S12}^{n}c_{12}^{n}}}{\left( {1 + {K_{S6}c_{6}} + {K_{S12}c_{12}}} \right)^{n}}}\end{matrix}}$

Example 3.2 Model Definition

As the modelling philosophy we are considering here describes dynamicsof proteins, the model of DR circuits in this paper differs from modeldescribed previously (Grant et al., Mol. Sys. Biol. (2016) 12:849) asthe concentrations of luxR and lasR become dynamic quantities affectedby growth dilution. Furthermore, in Grant et al., we parameterised thedouble receiver by first parameterising circuits that only have one PLuxpromoter. Here, we seek to parameterise the DR circuits directly, byinferring parameters against YFP and CFP measurements simultaneously. Assuch, the model must consider that luxR and lasR-based regulators (boundto HSLs) might bind to more than one (PLux) promoter. FIG. 11 shows thedesign of the double receiver circuit.

We denote by C_(k) the HSL molecule with length k carbon chain, and byG_(i) the pLux76 and pLas81 promoters. Then we can specify all of thereactions between the HSLs, luxR and lasR, and eventual binding oftranscriptional regulators to the P_(OLux) and P_(OLas) promoters.

$\begin{matrix}\begin{matrix}{{{luxR} + C_{k}}\underset{u_{Rk}}{\overset{b_{Rk}}{\rightleftharpoons}}{{luxR} - C_{k}}} \\{{{lasR} + C_{k}}\underset{u_{Sk}}{\overset{b_{Sk}}{\rightleftharpoons}}{{lasR} - C_{k}}} \\{{{luxR} - C_{k} + {luxR} - C_{k}}\underset{u_{Dk}}{\overset{b_{Dk}}{\rightleftharpoons}}{{luxR} - C_{k} - {luxR} - C_{k}}} \\{{{lasR} - C_{k} + {lasR} - C_{k}}\underset{u_{Ek}}{\overset{b_{Ek}}{\rightleftharpoons}}{{lasR} - C_{k} - {lasR} - C_{k}}} \\{{G_{i} + {luxR} - C_{k} - {luxR} - C}\underset{u_{GiDk}}{\overset{b_{GiDk}}{\rightleftharpoons}}{G_{i} - {luxR} - C_{k} - {luxR} - C_{k}}} \\{{G_{i} + {lasR} - C_{k} - {lasR} - C_{k}}\underset{u_{GiEk}}{\overset{b_{GiEk}}{\rightleftharpoons}}{G_{i} - {lasR} - C_{k} - {lasR} - C_{k}}}\end{matrix} & (9)\end{matrix}$

Constitutive expression of luxR, lasR is described by

$\begin{matrix}\begin{matrix}{\varnothing\overset{a_{luxR}}{\longrightarrow}{luxR}} & {\varnothing\overset{a_{lasR}}{\longrightarrow}{lasR}}\end{matrix} & (10)\end{matrix}$

Inducible expression of eCFP and eYFP by P_(OLux) and P_(OLas)respectively is described by

$\begin{matrix}\begin{matrix}{{G_{{pLux}76}\overset{a0_{{pLux}76}}{\longrightarrow}G_{{pLux}76}} + {{m\_ b0034}{\_ eCFP}}} \\{G_{{pLux}76} - {luxR} - C_{k} - {luxR} - {C_{k}\overset{a1_{{pLux}76}}{\longrightarrow}G_{{pLux}76}} - {luxR} - C_{k} - {luxR} -} \\{C_{k} + {{m\_ b0034}{\_ eCFP}}} \\{G_{{pLux}76} - {lasR} - C_{k} - {lasR} - {C_{k}\overset{a1_{{pLux}76}}{\longrightarrow}G_{{pLux}76}} - {lasR} - C_{k} - {lasR} -} \\{C_{k} + {{m\_ b0034}{\_ eCFP}}} \\{{G_{{pLas}81}\overset{a0_{{pLas}81}}{\longrightarrow}G_{{pLas}81}} + m_{b0034_{eCFP}}} \\{G_{{pLas}81} - {luxR} - C_{k} - {luxR} - {C_{k}\overset{a0_{{pLas}81}}{\longrightarrow}G_{{pLas}81}} - {luxR} - C_{k} - {luxR} -} \\{C_{k} + {{m\_ b0034}{\_ eYFP}}} \\{G_{{pLas}81} - {lasR} - C_{k} - {lasR} - {C_{k}\overset{a1_{{pLas}81}}{\longrightarrow}G_{{pLas}81}} - {lasR} - C_{k} - {lasR} -} \\{C_{k} + {{m\_ b0034}{\_ eCFP}}}\end{matrix} & (11)\end{matrix}$

The full DR circuit comprises reactions (9), (10) and (11), and the samereactions as shown for the PLPL circuit that describe mRNA translationfor eCFP and eYFP.

To produce a simplified ODE model amenable to parameter inference, westart with the equations describing luxR and lasR protein, theircomplexes involving C₆ and C₁₂, and the bound/unbound promoters.

Applying also Assumption 1 from before, we arrive a set of equations forthe non-mRNA species as

$\begin{matrix}{\frac{dc}{dt} = {{\gamma(c)}.c}} & \left( {12a} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{luxR}\rbrack}{dt} = {a_{R} + {u_{Rk}\left\lbrack {{luxR} - C_{k}} \right\rbrack} - {\lbrack{luxR}\rbrack\left( {{\gamma(c)} + d_{R} + {b_{Rk}\left\lbrack C_{k} \right\rbrack}} \right)}}} & \left( {12b} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{lasR}\rbrack}{dt} = {a_{S} + {u_{Sk}\left\lbrack {{lasR} - C_{k}} \right\rbrack} - {\lbrack{lasR}\rbrack\left( {{\gamma(c)} + d_{S} + {b_{Sk}\left\lbrack C_{k} \right\rbrack}} \right)}}} & \left( {12c} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left\lbrack {{luxR} - C_{k}} \right\rbrack}{dt} = {{{b_{Rk}\lbrack{luxR}\rbrack}\left\lbrack c_{k} \right\rbrack} + {2{u_{Dk}\left\lbrack {{luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack}} -}} & \left( {12d} \right)\end{matrix}$[luxR − C_(k)](γ(c) + d_(R)¹ + u_(Rk) + 2b_(Dk)[luxR − C_(k)])$\begin{matrix}{\frac{d\left\lbrack {{lasR} - C_{k}} \right\rbrack}{dt} = {{{b_{Sk}\lbrack{lasR}\rbrack}\left\lbrack c_{k} \right\rbrack} + {2{u_{Ek}\left\lbrack {{lasR} - C_{k} - {lasR} - C_{k}} \right\rbrack}} -}} & \left( {12e} \right)\end{matrix}$[lasR − C_(k)](γ(c) + d_(S)¹ + u_(Sk) + 2b_(Ek)[lasR − C_(k)])$\begin{matrix}{\frac{d\left\lbrack {{luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack}{dt} = {{b_{Dk}\left\lbrack {{luxR} - C_{k}} \right\rbrack}^{2} +}} & \left( {12f} \right)\end{matrix}$u_(GiDk)∑_(i)[G_(i) − luxR − C_(k) − luxR − C_(k)]… − [luxR − C_(k) − luxR − C_(k)](γ(c) + d_(R)¹ + u_(Dk) + b_(GiDk)∑_(i)[G_(i)]) $\begin{matrix}{\frac{d\left\lbrack {{lasR} - C_{k} - {lasR} - C_{k}} \right\rbrack}{dt} = {{b_{Ek}\left\lbrack {{lasR} - c_{k}} \right\rbrack}^{2} +}} & \left( {12g} \right)\end{matrix}$u_(GiEk)∑_(i)[G_(i) − lasR − C_(k) − lasR − C_(k)]… − [lasR − C_(k) − lasR − C_(k)](γ(c) + d_(R)¹ + u_(Ek) + b_(GiEk)∑_(i)[G_(i)]) $\begin{matrix}{\frac{d\left\lbrack {G_{i} - {luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack}{dt} =} & \left( {12h} \right)\end{matrix}$ b_(GiDk)[G_(i)][luxR − C_(k) − luxR − C_(k)]−(γ(c) + u_(GiDk))[G_(i) − luxR − C_(k) − luxR − C_(k)] $\begin{matrix}{\frac{d\left\lbrack {G_{i} - {lasR} - C_{k} - {lasR} - C_{k}} \right\rbrack}{dt} =} & \left( {12i} \right)\end{matrix}$ b_(GiEk)[G_(i)][lasR − C_(k) − lasR − C_(k)]−(γ(c) + u_(GiEk))[G_(i) − lasR − C_(k) − lasR − C_(k)] $\begin{matrix}{\frac{d\left\lbrack G_{i} \right\rbrack}{dt} = {{{\gamma(c)}.n_{G}} + {u_{GiDk}\left\lbrack {G_{i} - {luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack} +}} & \left( {12j} \right)\end{matrix}$u_(GiEk)[G_(i) − lasR − C_(k) − lasR − C_(k)]… − [G_(i)](γ(c)+b_(GiDk)[luxR − C_(k) − luxR − C_(k)] + b_(GiEk)[lasR − C_(k) − lasR − C_(k)])

Here, luxR-C_(k) refers to a luxR-HSL heterodimer andluxR-C_(k)-luxR-C_(k) is the tetrameric complex comprising twoluxR-C_(k) complexes (no cross-binding of C6 and C12 dimers). Similarly,lasR-C_(k) and lasR-C_(k)-lasR-C_(k) are equivalent lasR-HSL complexes.The luxR and lasR proteins are assumed to be stabilised by the bindingof signal. Therefore, rates d¹ _(j)<d_(j)(j=R,S) are the slower ratesfor signal-bound luxR/lasR proteins.

Solving most of the above system equal to zero, we obtain thequasi-equilibrium

$\begin{matrix}{\left\lbrack {G_{i} - {luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack^{*} = {{K_{GDk}\left\lbrack G_{i} \right\rbrack}\left\lbrack {{luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack}} & \left( {13a} \right)\end{matrix}$ $\begin{matrix}{\left\lbrack {{luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack^{*} = {K_{Dk}\left\lbrack {{luxR} - C_{k}} \right\rbrack}^{2}} & \left( {13b} \right)\end{matrix}$ $\begin{matrix}{\left\lbrack {{luxR} - C_{k}} \right\rbrack^{*} = {{{K_{Rk}\left\lbrack G_{i} \right\rbrack}\lbrack{luxR}\rbrack}\left\lbrack C_{k} \right\rbrack}} & \left( {13c} \right)\end{matrix}$ where${K_{Rk} = \frac{b_{Rk}}{\gamma + u_{Rk}}},{K_{Dk} = {{\frac{b_{Dk}}{u_{Dk}}{and}K_{GiDk}} = \frac{b_{GiDk}}{u_{GiDk}}}}$

Therefore (also symmetry of luxR and lasR),

[G _(i)-luxR-C _(k)-luxR-C _(k) ]*=K _(GiDk) K _(Dk)(K _(Rk) [C_(k)][luxR])²  (14a)

[G _(i)-lasR-C _(k)-lasR-C _(k) ]*=K _(GiEk) K _(Ek)(K _(Sk) [C_(k)][lasR])²  (14b)

where the new K's are defined as above.

In the reduced system, total luxR is described by the equations

$\begin{matrix}{\frac{{d\lbrack{luxR}\rbrack}_{T}}{dt} = {{\frac{d\lbrack{luxR}\rbrack}{dt} + {{\sum}_{k}\left( {\frac{d\left\lbrack {{luxR} - C_{k}} \right\rbrack}{dt} + {2\frac{d\left\lbrack {{luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack}{dt}} + {2{\sum}_{i}\frac{d\left\lbrack {G_{i} - {luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack}{dt}}} \right)}} = {a_{R} - {\gamma\lbrack{luxR}\rbrack} - {d_{R}\lbrack{luxR}\rbrack}}}} & \left( {15a} \right)\end{matrix}$ $\begin{matrix}{{- {\sum}_{k}}\left( {{\left( {\gamma + d_{R}^{1}} \right)\left\lbrack {{luxR} - C_{k}} \right\rbrack} + {\left( {\gamma + d_{R}^{1}} \right)\left\lbrack {{luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack} + {{\sum}_{i}\left\lbrack {G_{i} - {luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack}} \right)} & \left( {15b} \right)\end{matrix}$ $\begin{matrix}{= {a_{R} - {\gamma\lbrack{luxR}\rbrack}_{T} - {d_{R}^{1}\lbrack{luxR}\rbrack}_{T} - {\left( {d_{R} - d_{R}^{1}} \right)\lbrack{luxR}\rbrack}}} & \left( {15c} \right)\end{matrix}$

The final term needs careful attention, as we expect a differencebetween d_(R) and d_(R) ¹. Nevertheless, the obvious approximation is toignore this difference, and model [luxR]_(T) explicitly with nodependence on [R]. We can write down expressions for the fraction of[luxR]_(T) that is bound to signal, dimerized, etc., using theequilibrium relationships above. E.g

$\begin{matrix}{\lbrack{luxR}\rbrack_{T} = {\lbrack{luxR}\rbrack + {\sum\limits_{k}\left( {\left\lbrack {{luxR} - C_{k}} \right\rbrack + {2\left\lbrack {{luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack} +} \right.}}} \\\left. {}{2{\sum\limits_{i}\left\lbrack {G_{i} - {luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack}} \right) \\{= {\lbrack{luxR}\rbrack + {\sum\limits_{k}\left( {{{K_{Rk}\lbrack{luxR}\rbrack}\left\lbrack C_{k} \right\rbrack} + {2K_{Dk}{{K_{Rk}^{2}\lbrack{luxR}\rbrack}^{2}\left\lbrack C_{k} \right\rbrack}^{2}} +} \right.}}} \\\left. {}{\sum\limits_{i}{2{K_{GiDk}\left\lbrack G_{i} \right\rbrack}K_{Dk}{{K_{Rk}^{2}\lbrack{luxR}\rbrack}^{2}\left\lbrack C_{k} \right\rbrack}^{2}}} \right)\end{matrix}$

When C_(k) is low, total LuxR is closely approximated by free LuxR,[luxR]_(T)≈[luxR]. But when C_(k) is high, [luxR]_(T) should bepartitioned between the [luxR-C_(k)-luxR-C_(k)] and[G_(i)-luxR-C_(k)-luxR-C_(k)] species. Therefore, to simplify theanalysis, we propose modelling this by using the assumption

$\begin{matrix}{{\lbrack{luxR}\rbrack_{T} \approx {\lbrack{luxR}\rbrack + {\sum\limits_{k}\left\lbrack {{luxR} - C_{k}} \right\rbrack}}} = {\lbrack{luxR}\rbrack\left( {1 + {\sum\limits_{k}{K_{Rk}\left\lbrack C_{k} \right\rbrack}}} \right)}} & (16)\end{matrix}$

which still captures the saturation of luxR by C_(k), using theapproximations

$\begin{matrix}{\lbrack{luxR}\rbrack \approx {\lbrack{luxR}\rbrack_{T}.\frac{1}{1 + {{\sum}_{k}{K_{Rk}\left\lbrack C_{k} \right\rbrack}}}}} & \left( {17a} \right)\end{matrix}$ $\begin{matrix}{\left\lbrack {{luxR} - C_{k}} \right\rbrack \approx {\lbrack{luxR}\rbrack_{T}.\frac{K_{Rk}\left\lbrack C_{k} \right\rbrack}{1 + {{\sum}_{k}{K_{Rk}\left\lbrack C_{k} \right\rbrack}}}}} & \left( {17b} \right)\end{matrix}$ $\begin{matrix}{\left\lbrack {{luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack \approx {{K_{Dk}\lbrack{luxR}\rbrack}_{T}^{2}\left( \frac{K_{Rk}\left\lbrack C_{k} \right\rbrack}{1 + {{\sum}_{k}{K_{Rk}\left\lbrack C_{k} \right\rbrack}}} \right)^{2}}} & \left( {17c} \right)\end{matrix}$ $\begin{matrix}{\left\lbrack {G_{i} - {luxR} - C_{k} - {luxR} - C_{k}} \right\rbrack \approx {{{K_{GR}^{(i)}\left\lbrack G_{i} \right\rbrack}\lbrack{luxR}\rbrack}_{T}^{2}\left( \frac{K_{Rk}\left\lbrack C_{k} \right\rbrack}{1 + {{\sum}_{k}{K_{Rk}\left\lbrack C_{k} \right\rbrack}}} \right)^{2}}} & \left( {17d} \right)\end{matrix}$ where K_(GR)^((i)) = K_(GiDk)K_(Dk)

is assumed to be independent of which signal is bound (k).

The approximation also allows for saturation of G_(i). By takingadvantage of the conservation law[G_(i)]+[G_(i)·luxR-C₆-luxR-C₆]+[G₁₂-luxR-C]-luxR-C]+[G₆-lasR-C]-lasR-C]+[G₁₂-lasR-C]-lasR-C]=N_(i),we can derive the rate of production of mRNA as a function of [R]_(T),[S]_(T) [C₆] and [C₁₂]. For notational convenience we drop the squarebrackets and T subscript, which allows us to write

$\begin{matrix}{{f_{i}\left( {R,S,C_{6},C_{12}} \right)} =} & (18)\end{matrix}$$\frac{\varepsilon^{(i)} + {K_{GR}^{(i)}{R^{2}\left( \frac{K_{Rk}C_{k}}{1 + {{\sum}_{k}K_{Rk}C_{k}}} \right)}^{n_{R}}} + {K_{GS}^{(i)}{S^{2}\left( \frac{K_{Sk}C_{k}}{1 + {{\sum}_{k}K_{Sk}C_{k}}} \right)}^{n_{S}}}}{1 + {K_{GR}^{(i)}{R^{2}\left( \frac{K_{Rk}C_{k}}{1 + {{\sum}_{k}K_{Rk}C_{k}}} \right)}^{n_{R}}} + {K_{GS}^{(i)}{S^{2}\left( \frac{K_{Rk}C_{k}}{1 + {{\sum}_{k}K_{Rk}C_{k}}} \right)}^{n_{S}}}}$

where K_(G) _(i) _(-luxR) and K_(G) _(i) _(-lasR) are scaled toincorporate K_(Dk) and K_(Ek) respectively, and we introduce alternativeexponents n_(R) and n_(S). Accordingly, we obtain the following systemof equations

$\begin{matrix}{\frac{dc}{dt} = {{\gamma(c)}.c}} & \left( {19a} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{luxR}\rbrack}{dt} = {{a_{R}.{h(c)}} - {\left( {d_{R} + {\gamma(c)}} \right)\lbrack{luxR}\rbrack}}} & \left( {19b} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{lasR}\rbrack}{dt} = {{a_{S}.{h(c)}} - {\left( {d_{R} + {\gamma(c)}} \right)\lbrack{lasR}\rbrack}}} & \left( {19c} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{CFP}\rbrack}{dt} = {{a_{C}.{h(c)}.{f_{76}\left( {\left\lbrack C_{6} \right\rbrack,\left\lbrack C_{12} \right\rbrack,\lbrack R\rbrack,\lbrack S\rbrack} \right)}} - {\left( {d_{CFP} + {\gamma(c)}} \right)\lbrack{CFP}\rbrack}}} & \left( {19d} \right)\end{matrix}$ $\begin{matrix}{\frac{d\lbrack{YFP}\rbrack}{dt} = {{a_{Y}.{h(c)}.{f_{81}\left( {\left\lbrack C_{6} \right\rbrack,\left\lbrack C_{12} \right\rbrack,\lbrack R\rbrack,\lbrack S\rbrack} \right)}} - {\left( {d_{YFP} + {\gamma(c)}} \right)\lbrack{YFP}\rbrack}}} & \left( {19e} \right)\end{matrix}$

where the variables R and S now represent total luxR and lasRrespectively.

Characterisation Experiment

To characterise the LuxR and LasR signalling components, we usedmeasurements of the response of four DR circuits from Grant et al.,(Mol. Systems Biol., 2016, 12(1):849) to treatment with C6-HSL andC12-HSL over 3-fold dilutions. The maximum LuxR and LasR productionrates were normalized to the values corresponding to the Pcat promoters,as done previously, thus setting the scale for unobserved concentrationsof LuxR and LasR.

We used (uninformative) uniform priors on the previously uncharacterisedparameters, and (informative) truncated Gaussian priors on α₄₈₀, α₅₃₀,d_(CFP) and d_(YFP) with mean and standard deviation taken from themarginal posteriors of the PRPR circuit characterisation. We also useuninformative priors on the B_(k) ^(back) parameters, withoutpropagating their marginal posteriors from previous circuits.

Example 4 Dynamic Characterisation of Two Relay Signalling Devices

To test more directly the characterisation of dynamics in engineeredgene expression circuits, we constructed two relay signalling devices.The first responds to C6-HSL via binding of constitutively expressedLuxR and activation of the engineered P_(OLux) promoter, whichsynthesizes the C12-HSL-producing LasI enzyme and CFP.C12-HSL is thendetected via binding of constitutively expressed LasR and activation ofthe engineered P_(OLas) promoter, which synthesizes YFP. Accordingly, wecan monitor the promoter activity of both stages of the relay.Chromosomally integrated RFP provides a ratiometric control, as before.The second device performs the receive-send receive in the other order,sensing C12-HSL via P_(OLas), which synthesizes the C6-HSL-producingLuxI enzyme (and YFP), with C6-HSL then detected via P_(OLux)-CFP. Thisarrangement is shown in FIGS. 2A and 2B.

The relay circuit network diagrams are also provided in FIGS. 12A and12B.

Building on the above approach, we can write down an equation for theintracellular production of LuxI or LasI. Each then synthesizes C6-HSLand C12-HSL respectively, which we model as a linear function of [luxI]or [lasI]. In the models described above, it was assumed that transportbetween the intracellular and extracellular components was fast. Asthere was no de novo synthesis of HSL, and often a large extracellularexcess, the system dynamics would unlikely be critically sensitive tothis approximation, as any lag could be accounted for by adjustments inthe other parameters. However, in the relay circuits, where HSLs arebeing produced, the transport between intracellular and extracellularcompartments must be considered more carefully. For the C6-LuxI-C12relay, the equations for the mass of each molecule in the intracellular(denoted by subscript i) and extracellular (denoted by subscript e)compartments is given by

$\frac{d\left( {V_{i}.\lbrack{lasI}\rbrack} \right)}{dt} = {{V_{i}.{\xi(x)}.\rho_{L}.{f_{76}\left( {\left\lbrack C_{6} \right\rbrack,\left\lbrack C_{12} \right\rbrack,\lbrack{luxR}\rbrack,\lbrack{lasR}\rbrack} \right)}} - {V_{i}.{d_{L}\lbrack{lasI}\rbrack}}}$${\frac{d\left( {V_{i}.\left\lbrack C_{12} \right\rbrack_{i}} \right)}{dt} = {{V_{i}.k_{C12}.\lbrack{lasI}\rbrack} + {\eta_{12}\left( {\left\lbrack C_{12} \right\rbrack_{e} - \left\lbrack C_{12} \right\rbrack_{i}} \right)}}}{\frac{d\left( {V_{e}.\left\lbrack C_{12} \right\rbrack_{e}} \right)}{dt} = {\eta_{12}\left( {\left\lbrack C_{12} \right\rbrack_{i} - \left\lbrack C_{12} \right\rbrack_{e}} \right)}}$

where ρ_(L) is the relative production rate of lasI (relative to CFPproduction; see above), and k_(C12) is the rate of C12-HSL synthesis bylasI. To arrive at a set of equations for the derivatives of theconcentrations of each molecule (in each compartment), we expand theleft-hand sides using the chain rule. Then, defining

$\gamma:={\frac{1}{V_{i}}\frac{{dV}_{i}}{dt}}$

as the specific growth rate, or dilution factor, assuming thatV_(e)=V_(tot)−V_(i), and rescaling [lasI] according to [lasI]=L/ρ_(L) weobtain

${\frac{dL}{dt} = {{{\xi(x)}.{f_{76}\left( {\left\lbrack C_{6} \right\rbrack,\left\lbrack C_{12} \right\rbrack,\lbrack{luxR}\rbrack,\lbrack{lasR}\rbrack} \right)}} - {\left( {d_{L} + \gamma} \right)L}}}{\frac{{d\left\lbrack C_{12} \right\rbrack}_{i}}{dt} = {{k_{C12}^{\prime}.L} + {\frac{\eta_{12}}{V_{i}}\left( {\left\lbrack C_{12} \right\rbrack_{e} - \left\lbrack C_{12} \right\rbrack_{i}} \right)} - {\gamma\left\lbrack C_{12} \right\rbrack}_{i}}}{\frac{{d\left\lbrack C_{12} \right\rbrack}_{e}}{dt} = {{\frac{\eta_{12}}{V_{tot} - V_{i}}\left( {\left\lbrack C_{12} \right\rbrack_{i} - \left\lbrack C_{12} \right\rbrack_{e}} \right)} + \frac{\gamma V_{i}}{V_{tot} - V_{i}}}}{{{where}k_{C12}^{\prime}} = {k_{C12}/{\rho_{L}.}}}$

As we had no reliable estimates of HSL transport rates a priori, westarted by using the model that combines (19) with (16). However, wealso considered the assumption that HSL transport is fast, resulting inthe relations [Ck]e=[Ck] for k=6, 12. By summing equations (18b) and(18c), we obtain

$\frac{d\left( {V_{tot}.\left\lbrack C_{12} \right\rbrack} \right)}{dt} = {{\frac{d\left( {V_{i}.\left\lbrack C_{12} \right\rbrack_{i}} \right)}{dt} + \frac{d\left( {V_{e}.\left\lbrack C_{12} \right\rbrack_{e}} \right)}{dt}} = {V_{i}.k_{C12}.\lbrack{lasI}\rbrack}}$

which when rearranged gives an alternative model (19) as

${\frac{dL}{dt} = {{{\xi(x)}.{f_{76}\left( {\left\lbrack C_{6} \right\rbrack,\left\lbrack C_{12} \right\rbrack,\lbrack{luxR}\rbrack,\lbrack{lasR}\rbrack} \right)}} - {\left( {d_{L} + \gamma} \right)L}}}{\frac{d\left\lbrack C_{12} \right\rbrack}{dt} = {\frac{V_{i}}{V_{tot}}.k_{C12}^{\prime}.L}}$

with k′_(C12) and L as defined above.

Example 5 Inducible Expression (PBad)

The arabinose-PBad system is commonly used to control expression insynthetic gene circuits. The PBad promoter is part of the arabinoseoperon, which also involves the araC protein. When arabinose is added tocell cultures at different concentrations, it increases expression of adownstream transcript. In this study, we have used PBad to control AHLlactonase expression (see further below).

To characterise the relationship between PBad and arabinose in vivo, weplaced the coding sequence for eYFP downstream of PBad (as shown in FIG.13 ) and measured the response to varying concentrations of arabinose.

Example 6 Dynamic Characterisation of AHL Lactonase (AiiA)

AHL molecules can be degraded by AHL lactonase enzymes. The mostcommonly used lactonase in synthetic biology applications is AiiA(Danino et al., Nature, 2010, 463(7279): 326-330; Chen et al., Science,2015, 349(6251): 986-989), which originates with Bacillus thurigensis.AiiA is mostly non-specific; catalytic activity has been measuredagainst a large variety of AHLs with varying length carbon chains (Liuet al., Biochemistry, 2008, 47(29): 7706-7714).

To characterise the AiiA lactonase activity in a synthetic gene circuit,we expressed AiiA under the control of the arabinose-inducible PBadpromoter (as shown in FIG. 3 and in FIG. 14 ). As AHLs may translocatebetween cells and the extracellular medium, and AiiA is only expressedinside cells in this strategy, we must carefully consider the volume andconcentration of these molecules over time, as before.

Assuming constitutive production, a few reactions describing theproduction, degradation and catalytic activity of AiiA can be asfollows:

$\begin{matrix}{\varnothing\begin{matrix}{a_{A}.{f_{Pbad}({Ara})}} \\\rightleftharpoons \\d_{H}\end{matrix}{AiiA}} & \left( {21a} \right)\end{matrix}$ $\begin{matrix}{{C_{6} + {{AiiA}\begin{matrix}b_{A6} \\\rightleftharpoons \\u_{A6}\end{matrix}{C_{6}.{AiiA}}}}\overset{d_{A6}}{\rightarrow}{AiiA}} & \left( {21b} \right)\end{matrix}$ $\begin{matrix}{{C_{12} + {{AiiA}\begin{matrix}b_{A12} \\\rightleftharpoons \\u_{A12}\end{matrix}{C_{12}.{AiiA}}}}\overset{d_{A12}}{\rightarrow}{AiiA}} & \left( {21c} \right)\end{matrix}$

Translating these reactions to ordinary differential equations, alongwith the equations for the double receiver module, we obtain

$\begin{matrix}{\frac{d\left( {V_{i}.\lbrack A\rbrack} \right)}{dt} = {{V_{i}.a_{A}.{f_{Pbad}({Ara})}} + {V_{i}.{\left( {u_{A6} + d_{A6}} \right)\left\lbrack {A.C_{6}} \right\rbrack}} + {V_{i}.{\left( {u_{A12} + d_{A12}} \right)\left\lbrack {A.C_{12}} \right\rbrack}}}} & \left( {22a} \right)\end{matrix}$ $\begin{matrix}{- {V_{i}.{\left( {d_{A} + {b_{A6}\left\lbrack C_{6} \right\rbrack}_{i} + {b_{A12}\left\lbrack C_{12} \right\rbrack}_{i}} \right)\lbrack A\rbrack}}} & \left( {22b} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left( {V_{i}.\left\lbrack C_{6} \right\rbrack_{i}} \right)}{dt} = {{V_{i}.u_{A6}.\left\lbrack {A.C_{6}} \right\rbrack} + {\eta_{6}\left( {\left\lbrack C_{6} \right\rbrack_{e} - \left\lbrack C_{6} \right\rbrack_{i}} \right)} - {V_{i}.{\left( {d_{C} + {b_{A6}\lbrack A\rbrack}} \right)\left\lbrack C_{6} \right\rbrack}_{i}}}} & \left( {22c} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left( {V_{i}.\left\lbrack C_{12} \right\rbrack_{i}} \right)}{dt} = {{V_{i}.u_{A12}.\left\lbrack {A.C_{12}} \right\rbrack} + {\eta_{12}\left( {\left\lbrack C_{12} \right\rbrack_{e} - \left\lbrack C_{12} \right\rbrack_{i}} \right)} - {V_{i}.{\left( {d_{C} + {b_{A122}\lbrack A\rbrack}} \right)\left\lbrack C_{12} \right\rbrack}_{i}}}} & \left( {22d} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left( {V_{i}.\left\lbrack {A.C_{6}} \right\rbrack} \right)}{dt} = {{V_{i}.{{b_{A6}\lbrack A\rbrack}\left\lbrack C_{6} \right\rbrack}_{i}} - {V_{i}.{\left( {u_{A6} + d_{A6}} \right)\left\lbrack {A.C_{6}} \right\rbrack}}}} & \left( {22e} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left( {V_{i}.\left\lbrack {A.C_{12}} \right\rbrack} \right)}{dt} = {{V_{i}.{{b_{A6}\lbrack A\rbrack}\left\lbrack C_{12} \right\rbrack}_{i}} - {V_{i}.{\left( {u_{A12} + d_{A12}} \right)\left\lbrack {A.C_{12}} \right\rbrack}}}} & \left( {22f} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left( {V_{e}.\left\lbrack C_{6} \right\rbrack_{e}} \right)}{dt} = {\eta_{6}\left( {\left\lbrack C_{6} \right\rbrack_{i} - \left\lbrack C_{6} \right\rbrack_{e}} \right)}} & \left( {22g} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left( {V_{e}.\left\lbrack C_{12} \right\rbrack_{e}} \right)}{dt} = {\eta_{12}\left( {\left\lbrack C_{12} \right\rbrack_{i} - \left\lbrack C_{12} \right\rbrack_{e}} \right)}} & \left( {22h} \right)\end{matrix}$

We can apply an equilibrium assumption to the AiiA-HSL intermediate byequating (22e) and (22f) to zero. This gives

$\begin{matrix}{\left\lbrack {A.C_{j}} \right\rbrack = {{{\frac{b_{Aj}}{\gamma + u_{Aj} + k_{Aj}}\lbrack A\rbrack}\left\lbrack C_{j} \right\rbrack} \approx {{\frac{b_{Aj}}{u_{Aj} + k_{Aj}}\lbrack A\rbrack}\left\lbrack C_{j} \right\rbrack}}} & (23)\end{matrix}$

where the approximation to suppress γ is due to the catalytic activityof AiiA being significantly faster than the dilution rate.

Therefore, depending on the concentrations [C6] and [C12], the lactonaseis bound according to the conservation equation

$\begin{matrix}{\lbrack A\rbrack = {{\left\lbrack A_{f} \right\rbrack + \left\lbrack {A.C_{6}} \right\rbrack + \left\lbrack {A.C_{12}} \right\rbrack} = {\lbrack A\rbrack.\left( {1 + {K_{A6}\left\lbrack C_{6} \right\rbrack} + {K_{A12}\left\lbrack C_{12} \right\rbrack}} \right)}}} & (24)\end{matrix}$ ${{where}K_{Aj}} = {\frac{b_{Aj}}{u_{Aj} + k_{Aj}}.}$

Alternatively, depending on the concentrations [C6] and [C12], thelactonase is bound according to the conservation equation

${\left\lbrack A_{tot} \right\rbrack = {{\lbrack A\rbrack + \left\lbrack {A.C_{6}} \right\rbrack + \left\lbrack {A.C_{12}} \right\rbrack} = {\lbrack A\rbrack.\left( {1 + {K_{A6}\left\lbrack C_{6} \right\rbrack} + {K_{A12}\left\lbrack C_{12} \right\rbrack}} \right)}}}{where}{K_{Aj} = {\frac{b_{Aj}}{u_{Aj} + d_{Aj}}.}}$

By cancelling the binding and unbinding terms with AiiA, we are leftwith the simpler set of equations

$\begin{matrix}{\frac{d\left( {V_{i}.\lbrack A\rbrack} \right)}{dt} = {{V_{i}.a_{A}.{f_{Pbad}({Ara})}} - {V_{i}.d_{A}.\lbrack A\rbrack}}} & \left( {25a} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left( {V_{i}.\left\lbrack C_{6} \right\rbrack_{i}} \right)}{dt} = {{\eta_{6}\left( {\left\lbrack C_{6} \right\rbrack_{e} - \left\lbrack C_{6} \right\rbrack_{i}} \right)} - {V_{i}.d_{A6}.\frac{{\lbrack A\rbrack\left\lbrack C_{6} \right\rbrack}_{i}}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack}_{i} + {K_{A12}\left\lbrack C_{12} \right\rbrack}_{i}}} - {V_{i}.d_{C}.\left\lbrack C_{j} \right\rbrack}}} & \left( {25b} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left( {V_{i}.\left\lbrack C_{12} \right\rbrack_{i}} \right)}{dt} = {{\eta_{12}\left( {\left\lbrack C_{12} \right\rbrack_{e} - \left\lbrack C_{12} \right\rbrack_{i}} \right)} - {V_{i}.d_{A12}.\frac{{\lbrack A\rbrack\left\lbrack C_{12} \right\rbrack}_{i}}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack}_{i} + {K_{A12}\left\lbrack C_{12} \right\rbrack}_{i}}} - {V_{i}.d_{C}.\left\lbrack C_{j} \right\rbrack}}} & \left( {25c} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left( {V_{e}.\left\lbrack C_{6} \right\rbrack_{e}} \right)}{dt} = {\eta_{6}\left( {\left\lbrack C_{6} \right\rbrack_{i} - \left\lbrack C_{6} \right\rbrack_{e}} \right)}} & \left( {25d} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left( {V_{e}.\left\lbrack C_{12} \right\rbrack_{e}} \right)}{dt} = {\eta_{12}\left( {\left\lbrack C_{12} \right\rbrack_{i} - \left\lbrack C_{12} \right\rbrack_{e}} \right)}} & \left( {25e} \right)\end{matrix}$

Or, alternatively, we are left with

${\frac{d\left( {V_{i}.\left\lbrack A_{Tot} \right\rbrack} \right)}{dt} = {{V_{i}.a_{A}.{f_{Pbad}({Ara})}} - {V_{i}.d_{A}.\frac{\left\lbrack A_{Tot} \right\rbrack}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack} + {K_{A12}\left\lbrack C_{12} \right\rbrack}}}}}{\frac{d\left( {V_{i}.\left\lbrack C_{6} \right\rbrack_{i}} \right)}{dt} = {{\eta_{6}\left( {\left\lbrack C_{6} \right\rbrack_{e} - \left\lbrack C_{6} \right\rbrack_{i}} \right)} - {V_{i}.d_{A6}.\frac{{\left\lbrack A_{Tot} \right\rbrack\left\lbrack C_{6} \right\rbrack}_{i}}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack}_{i} + {K_{A12}\left\lbrack C_{12} \right\rbrack}_{i}}} - {V_{i}.d_{C}.\left\lbrack C_{j} \right\rbrack}}}{\frac{d\left( {V_{i}.\left\lbrack C_{12} \right\rbrack_{i}} \right)}{dt} = {{\eta_{12}\left( {\left\lbrack C_{12} \right\rbrack_{e} - \left\lbrack C_{12} \right\rbrack_{i}} \right)} - {V_{i}.d_{A12}.\frac{{\left\lbrack A_{Tot} \right\rbrack\left\lbrack C_{12} \right\rbrack}_{i}}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack}_{i} + {K_{A12}\left\lbrack C_{12} \right\rbrack}_{i}}} - {V_{i}.d_{C}.\left\lbrack C_{j} \right\rbrack}}}$${\frac{d\left( {V_{e}.\left\lbrack C_{6} \right\rbrack_{e}} \right)}{dt} = {\eta_{6}\left( {\left\lbrack C_{6} \right\rbrack_{i} - \left\lbrack C_{6} \right\rbrack_{e}} \right)}}{\frac{d\left( {V_{e}.\left\lbrack C_{12} \right\rbrack_{e}} \right)}{dt} = {\eta_{12}\left( {\left\lbrack C_{12} \right\rbrack_{i} - \left\lbrack C_{6} \right\rbrack_{e}} \right)}}$

As before, we seek a system of equations for the rates of change of themolecular concentrations, but distinguish between whether we assume thattransport is fast enough to remove from the model or not. Weadditionally augment each equation system with the double receivermodule.

Slow Transport Model

We start by expanding the derivative on the left-hand sides, thendefining

$\gamma:={\frac{1}{V_{i}}\frac{{dV}_{i}}{dt}}$

as the specific growth rate, or dilution factor. Therefore, we obtain

$\begin{matrix}{\frac{d\lbrack A\rbrack}{dt} = {{a_{A}.{f_{Pbad}({Ara})}} - {\left( {d_{A} + \gamma} \right).\lbrack A\rbrack}}} & \left( {26a} \right)\end{matrix}$ $\begin{matrix}{\frac{{d\left\lbrack C_{6} \right\rbrack}_{i}}{dt} = {{\frac{1}{V_{i}}.{\eta_{6}\left( {\left\lbrack C_{6} \right\rbrack_{e} - \left\lbrack C_{6} \right\rbrack_{i}} \right)}} - {d_{A6}.\frac{{\lbrack A\rbrack\left\lbrack C_{6} \right\rbrack}_{i}}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack}_{i} + {K_{A12}\left\lbrack C_{12} \right\rbrack}_{i}}} - {\left( {d_{C} + \gamma} \right).\left\lbrack C_{j} \right\rbrack}}} & \left( {26b} \right)\end{matrix}$ $\begin{matrix}{\frac{{d\left\lbrack C_{12} \right\rbrack}_{i}}{dt} = {{\frac{1}{V_{i}}.{\eta_{12}\left( {\left\lbrack C_{12} \right\rbrack_{e} - \left\lbrack C_{12} \right\rbrack_{i}} \right)}} - {d_{A12}.\frac{{\lbrack A\rbrack\left\lbrack C_{12} \right\rbrack}_{i}}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack}_{i} + {K_{A12}\left\lbrack C_{12} \right\rbrack}_{i}}} - {\left( {d_{C} + \gamma} \right).\left\lbrack C_{j} \right\rbrack}}} & \left( {26c} \right)\end{matrix}$ $\begin{matrix}{\frac{{d\left\lbrack C_{6} \right\rbrack}_{e}}{dt} = {{\frac{1}{V_{e}}.{\eta_{6}\left( {\left\lbrack C_{6} \right\rbrack_{i} - \left\lbrack C_{6} \right\rbrack_{e}} \right)}} - {\frac{1}{V_{e}}.\frac{{dV}_{e}}{dt}}}} & \left( {26d} \right)\end{matrix}$ $\begin{matrix}{\frac{\left. {d\left\lbrack C_{12} \right\rbrack}_{e} \right)}{dt} = {{\frac{1}{V_{e}}.{\eta_{12}\left( {\left\lbrack C_{12} \right\rbrack_{i} - \left\lbrack C_{6} \right\rbrack_{e}} \right)}} - {\frac{1}{V_{e}}.\frac{{dV}_{e}}{dt}}}} & \left( {26e} \right)\end{matrix}$

Two assumptions could be applied to V_(e): (1) Ve is constant, or (2)V_(e)=V_(tot)−V_(i), where V_(tot) is a fixed total volume.

Fast Transport Model

If we assume that HSL transport is infinitely fast, then this results inthe relations [C_(k)]_(e)=[C_(k)]_(i)=[C_(k)] for k=6,12. However, wemust be careful to maintain conservation of mass with respect toAiiA-mediated degradation. That is, we must scale the stoichiometricloss of HSL by the ratio of the extracellular and intracellular volumes.To see this, we write out equations for the new variables [C_(k)] interms of its constituent compartments.

$\begin{matrix}{\frac{d\left( {V_{i}.\lbrack A\rbrack} \right)}{dt} = {{V_{i}.a_{A}.{f_{Pbad}({Ara})}} - {V_{i}.d_{A}.\lbrack A\rbrack}}} & \left( {27a} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left( {V.\left\lbrack C_{6} \right\rbrack} \right)}{dt} = {{\frac{d\left( {V_{i}.\left\lbrack C_{6} \right\rbrack_{i}} \right)}{dt} + \frac{d\left( {V_{e}.\left\lbrack C_{6} \right\rbrack_{e}} \right)}{dt}} = {{- {V_{i}.d_{A6}.\frac{\lbrack A\rbrack\left\lbrack C_{6} \right\rbrack}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack} + {K_{A12}\left\lbrack C_{12} \right\rbrack}}}} - {V_{i}.d_{C}.\left\lbrack C_{6} \right\rbrack}}}} & \left( {27b} \right)\end{matrix}$ $\begin{matrix}{\frac{\left. \left. {d\left( {V.C_{12}} \right.} \right\rbrack \right)}{dt} = {{\frac{d\left( {V_{i}.\left\lbrack C_{12} \right\rbrack_{i}} \right)}{dt} + \frac{d\left( {V_{e}.\left\lbrack C_{12} \right\rbrack_{e}} \right)}{dt}} = {{- {V_{i}.d_{A12}.\frac{\lbrack A\rbrack\left\lbrack C_{12} \right\rbrack}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack} + {K_{A12}\left\lbrack C_{12} \right\rbrack}}}} - {V_{i}.d_{C}.\left\lbrack C_{12} \right\rbrack}}}} & \left( {27c} \right)\end{matrix}$

By dividing out the volumes from the derivatives on the left-hand sides,and assuming that V(the total volume) is constant and d_(C) isnegligible, we obtain

$\begin{matrix}{\frac{d\lbrack A\rbrack}{dt} = {{a_{A}.{f_{Pbad}({Ara})}} - {\left( {d_{A} + \gamma} \right).\lbrack A\rbrack}}} & \left( {28a} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left\lbrack C_{6} \right\rbrack}{dt} = {- {V_{i}.d_{A6}^{\prime}.\frac{\lbrack A\rbrack\left\lbrack C_{6} \right\rbrack}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack} + {K_{A12}\left\lbrack C_{12} \right\rbrack}}}}} & \left( {28b} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left\lbrack C_{12} \right\rbrack}{dt} = {- {V_{i}.d_{A12}^{\prime}.\frac{\lbrack A\rbrack\left\lbrack C_{12} \right\rbrack}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack} + {K_{A12}\left\lbrack C_{12} \right\rbrack}}}}} & \left( {28c} \right)\end{matrix}$

where γ is the dilution factor as before and d′_(Aj)=d_(Aj)/V for eachj.

As AiiA concentrations are never observed, we can arbitrarily rescale[A], reducing the parametric complexity of the model. By substituting[A]=k_(A)·α, we obtain

$\begin{matrix}{\frac{da}{dt} = {{f_{OLux}\left( {\left\lbrack C_{6} \right\rbrack,\left\lbrack C_{12} \right\rbrack} \right)} - {\left( {d_{A} + \gamma} \right)a}}} & \left( {29a} \right)\end{matrix}$ $\begin{matrix}{\frac{d\left\lbrack C_{6} \right\rbrack}{dt} = {- {V_{i}.{\hat{d}}_{6}.\frac{a.\left\lbrack C_{6} \right\rbrack}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack} + {K_{A12}\left\lbrack C_{12} \right\rbrack}}}}} & \left( {29b} \right)\end{matrix}$ $\begin{matrix}{{\frac{d\left\lbrack C_{12} \right\rbrack}{dt} = {- {V_{i}.{\hat{d}}_{12}.\frac{a.\left\lbrack C_{12} \right\rbrack}{1 + {K_{A6}\left\lbrack C_{6} \right\rbrack} + {K_{A12}\left\lbrack C_{12} \right\rbrack}}}}}{where}{{\hat{d}}_{j} = {d_{j}^{\prime}.a_{Aj}.}}} & \left( {29c} \right)\end{matrix}$

Items

The invention may further relate to the following items:

A method for determining one or more intrinsic properties of a DNAcomponent, from a plurality of measurements obtained over a time periodfrom a cell culture, with each cell comprising the DNA component and areference promoter, wherein the DNA component is involved intranscription of one or more target signals and the reference promoterinitiates transcription of a reference signal, wherein the plurality ofmeasurements comprises: (i) measurements relating to the density of thecell culture over the time period; (ii) measurements relating to theamount of the reference signal in the cell culture over the time period;and (iii) measurements relating to the amount of the one or more targetsignals in the cell culture over the time period, the method comprising:

-   -   (a) estimating parameter values for a first mathematical model        that describes cell culture density over time, by minimizing a        measure of the difference between the first mathematical model        output and the measurements relating to the density of the cell        culture over the time period;    -   (b) estimating parameter values for a second mathematical model        that describes the capacity of the cell culture to produce the        reference signal over time, by minimizing a measure of the        difference between the second mathematical model output and the        measurements relating to the amount of the reference signal in        the cell culture over the time period, wherein the second        mathematical model additionally uses the parameter values        estimated in (a);    -   (c) estimating parameters quantifying the intrinsic properties        of the DNA component, by embedding these parameters within a        third mathematical model that describes the production of the        one or more target signals over time, and by minimizing a        measure of the difference between the model outputs and the        measurements relating to the amount of the one or more target        signals, wherein the third mathematical model additionally uses        the parameter values estimated in (b) by taking the second        mathematical model to provide an estimate of the capacity of the        cell culture to produce the one or more target signals over        time.

The method as described above, wherein the DNA component is a promoteror an enhancer.

The method as described above, wherein the intrinsic property is theability to recruit a polymerase enzyme or the ability to bindtranscription factors.

The method as described above, wherein the sequences encoding thereference signal and the one or more target signals are within thechromosome of the cells in the cell culture.

The method as described above, wherein the cell culture is a culture ofbacterial, yeast or mammalian cells.

The method as described above, wherein the one or more target signalsare one or more target proteins.

The method as immediately described above, wherein the one or moretarget proteins are fluorescent proteins, wherein the reference signalis a different fluorescent protein, and wherein the measurementsrelating to the amount of the reference signal in the cell culture overthe time period and the measurements relating to the amount of the oneor more target signals in the cell culture over the time period arefluorescence measurements.

The method as described above, wherein the measurements relating to thedensity of the cell culture over the time period are optical densities.

The method as described above, wherein (a) to (c) are performedsimultaneously or sequentially.

The method as described above, wherein parameters for the firstmathematical model are selected from per capita culture growth rate,carrying capacity, and initial cell density.

The method as described above, wherein the first mathematical model is alogistic growth model.

The method as described above, wherein in (a) minimizing a measure ofthe difference is minimizing a sum of squared errors or the absolutedifferences between the first mathematical model and the measurementsrelating to the density of the cell culture over the time period.

The method as described above, wherein minimizing a measure of thedifference in (a) uses a Nelder-Mead simplex algorithm.

The method as described above, wherein the second mathematical modelmodels the capacity of the cell culture to produce the reference signalover the time period as a chemical reaction network.

The method as described above, wherein minimizing a measure of thedifference in (b) uses a Markov chain Monte Carlo method.

The method as described above, wherein the method is performed withmeasurements (i), (ii) and (iii) obtained from a plurality of separatecell cultures, wherein the plurality of separate cell cultures have beensubjected to different culture conditions, wherein (a), (b) and (c) areperformed for each separate cell culture, wherein in (b) variations inthe capacity to produce the reference signal over time between theplurality of cell cultures are quantified and used in (c) to factor outequivalent variations in the functioning of the DNA component and theone or more target signals.

The method as described above, further comprising obtaining themeasurements of (i), (ii) and (iii).

The method as described above, further comprising a step of adapting theDNA component in vitro based on the results of step (c).

The invention further provides a method of optimizing expression of atleast one gene comprised in a genetic circuit, wherein the geneticcircuit further comprises a DNA component which is involved intranscription of the at least one gene, wherein the method comprises:(1) determining one or more intrinsic properties of the DNA componentusing the method as described above; (2) using the one or more intrinsicproperties of the DNA component determined in (1) to simulate expressionof the at least one gene from the genetic circuit in at least twodifferent arrangements of the genetic circuit; (3) selecting thearrangement in (2) that results in optimal expression of the at leastone gene; and (4) making the arrangement of the genetic circuit selectedin step (3).

The invention further provides a computer program product embodied on acomputer readable storage and comprising code which is configured so asto perform the operations of the method described above when run on acomputer system.

Other variants may become apparent to a person skilled in the art oncegiven the disclosure herein. The scope of the present disclosure is notlimited by the above-described embodiments but only by the accompanyingclaims.

1. A method for determining one or more intrinsic properties of a DNAcomponent, from a plurality of measurements obtained over a time periodfrom a cell culture, with each cell comprising the DNA component,wherein the DNA component is involved in transcription of one or moretarget signals, wherein the plurality of measurements comprisesmeasurements relating to the density of the cell culture over the timeperiod and measurements relating to the amount of the one or more targetsignals in the cell culture over the time period, the method comprising:(a) estimating parameter values for a first mathematical model thatdescribes cell culture density over time, by minimizing a measure of thedifference between the first mathematical model output and themeasurements relating to the density of the cell culture over the timeperiod; (b) estimating parameters quantifying the intrinsic propertiesof the DNA component, by embedding these parameters within a furthermathematical model that describes the production of the one or moretarget signals over time, and by minimizing a measure of the differencebetween the model outputs and the measurements relating to the amount ofthe one or more target signals, wherein the further mathematical modeladditionally uses the parameter values estimated in (a) or parametervalues based thereon.